SAMSUNG

Building multimodal skills for the smart virtual assistant Bixby

My Role

From 2019 to 2020, I shipped 7 Bixby capsules for over 2 million Chinese users. The capsules let users get things done through speech and touch across mobile devices and smart speakers. I worked with the designated PM for each capsule to prioritize the use cases and supported engineers during implementation.

What I Did

- Creating flows
- Writing dialogue
- Lo-fi and hi-fi interfaces
- Building with design systems
- Design QA
- Tracking user feedback and metrics
- Iteration

Background

Bixby capsules are third-party capabilities of Samsung's virtual assistant Bixby. How powerful Bixby is depends on what tasks it can help users perform. Since the launch of Bixby in the Chinese market, our team has been integrating the third-party services Chinese people use daily into the Bixby platform. The capsules I designed include Tencent News, Sina Sports, Ximalaya FM, WeChat, Youdao Dictionary, and Flight Manager. The following case study will peek into how the Flight Manager capsule was built.

Design Goal

The goal of the Flight Manager capsule is to let users find flight information efficiently by voice. The visual interaction in the native app needs to be converted into multimodal interaction in Bixby.

Define must-work use cases

The goal was not to make Bixby assist with every task users could complete in the native app, so the first thing I did was narrow down the scope of the capsule with the product manager. Among all the features of the Flight Manager app, we identified the must-work use cases that were both important to users and efficient to complete through a voice interface.

They are:
1. Check flight status
2. Search flights with departure and arrival places

Unpack the use cases

After defining the use cases, I listed the specific must-work utterances for each scenario, along with what Bixby would need (inputs) in order to produce the desired results (outputs).

Create conversation flows to illustrate all the paths that can be taken

Then came the core of the design process: creating the back-and-forth conversation between the user and Bixby so that the user can complete the task as efficiently as possible. I created a flow covering every branch the conversation could take, with example dialogue for both the user and Bixby, e.g. how Bixby should ask for input when a keyword is missing from the user's request.
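
The "missing keyword" branch of such a flow can be sketched in a few lines of Python. This is only an illustration of the idea, not how the capsule was implemented: the slot names, the prompt wording, and the next_prompt helper are all hypothetical.

```python
from typing import Optional

# Hypothetical required inputs for the two must-work use cases.
REQUIRED_SLOTS = {
    "check_flight_status": ["flight_number"],
    "search_flights": ["departure", "arrival"],
}

# Hypothetical follow-up questions; not the shipped Bixby dialogue.
PROMPTS = {
    "flight_number": "Which flight number would you like to check?",
    "departure": "Where are you departing from?",
    "arrival": "Where are you flying to?",
}


def next_prompt(use_case: str, slots: dict) -> Optional[str]:
    """Return the question for the first missing input, or None when all inputs are present."""
    for name in REQUIRED_SLOTS[use_case]:
        if not slots.get(name):
            return PROMPTS[name]
    return None


# The user gave a departure city but no arrival city.
print(next_prompt("search_flights", {"departure": "Beijing"}))
```

Asking for one missing input at a time keeps each turn short, which matters more in voice than on screen.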

Design user interfaces that provide only the simplest, most relevant information

Users get both visual and voice information when they interact with Bixby on mobile phones, so each of Bixby's possible responses should have a corresponding visual interface. Because Bixby enables users to perform the key tasks of different services quickly without leaving Bixby, the visual interfaces of those services should be consistent, and the content on each interface should be minimal and relevant only to the current task. I chose appropriate components and layouts from Bixby Library to fit the information hierarchy of this capsule and prototyped from wireframes to high-fi mock-ups.

Design for Hands-off mode

Hands-off mode applies when users interact with Bixby on smart speakers (which have no screen) or use voice to activate Bixby on mobile phones (in which case we assume they can only communicate by voice). In hands-off mode, Bixby's spoken dialogue has to be more robust, while the screen is more minimal.

Take the use case "search flights with departure and arrival places". When the search returns multiple results, Bixby's response in hands-on mode is the voice response "we found those results from A to B" plus a list of results on screen. In hands-off mode, the voice response should instead read: the number of flights found + the detailed result of one flight + guidance on switching to the next flight.
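
The difference between the two modes can be sketched as a single function that assembles the response. This is a rough illustration under assumptions: the Flight fields, the build_response helper, and the exact spoken wording are hypothetical, and the sketch assumes the search returned at least one flight.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flight:
    # Hypothetical fields; the real capsule's data model is not shown in this case study.
    number: str
    departs: str
    arrives: str


def build_response(flights: List[Flight], origin: str, destination: str, hands_off: bool) -> dict:
    """Assemble speech and screen content for a non-empty list of search results."""
    if not hands_off:
        # Hands-on: short speech, with the details carried by the on-screen list.
        return {
            "speech": f"We found these results from {origin} to {destination}.",
            "screen_list": [f.number for f in flights],
        }
    # Hands-off: speech carries the count, one detailed result, and navigation guidance.
    first = flights[0]
    speech = (
        f"I found {len(flights)} flights from {origin} to {destination}. "
        f"The first is {first.number}, departing at {first.departs} "
        f"and arriving at {first.arrives}."
    )
    if len(flights) > 1:
        speech += ' Say "next" to hear the next flight.'
    return {"speech": speech, "screen_list": []}


results = [Flight("CA1501", "08:00", "10:15"), Flight("MU5102", "09:30", "11:50")]
print(build_response(results, "Beijing", "Shanghai", hands_off=True)["speech"])
```

Reading out one result at a time, rather than the whole list, keeps the spoken response short enough to follow without a screen.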