Work, writing

Voice Interaction

As phones have become smartphones, our personal technology has graduated from being a conduit between people to a more sophisticated breed that allows for – even invites – direct control. In tandem, people are abandoning voicemail, making fewer phone calls, and texting more. In one sense this seems like a more truncated, efficient behavior, but it also implies greater intimacy with the device.

We’re also growing to expect the same level of control we have over our phones to extend to the devices in our environment. For the time being, the “smart home” and “connected” objects are commanded with our phones. But contrary to the shift in phone use, control that is buried in a growing library of apps is not efficient.

The technological response for surfacing quick control over these smart objects is the voice interface. The Xbox’s Kinect allows for voice control of your Xbox apps and access to media. The Xfinity remote control makes “change the channel to HBO” possible. Apple’s Siri, Microsoft’s Cortana, Amazon’s Alexa, and Google Now are all serious attempts at broadening voice control to access many services.

While speech-to-text recognition has improved greatly, the voice-controlled services themselves still lack the sophistication that people presume exists when communicating through a medium as nuanced as speech. Even if this level of sophistication is attained, and the services understand and respond exactly as we expect them to, the challenge of intimacy remains.

As common interaction with phones shifted from calls to text – the interfaces allowing more direct (read: intimate) control – we created a controversial-yet-accepted balance between interacting with people directly and multitasking on our pocket computers. Voice interaction necessitates a more public display of that human–computer interaction, one so uncomfortable that it directly inhibits use. Think of the times you have used voice input on a phone: were they public settings, private settings with people around, or solitary settings?

Although we may not be able to out-design social mores, we can take the first challenge—that of accuracy, intuitive use, and predictable outcome—to the whiteboard and to the APIs.


Design Team:

Kristen Kersh • Niamh Parsely • Rob Brogan

Standard
Star Trek Holodeck
Work, writing

Design Considerations for Virtual Reality

Preamble:
Yes, much hype. Much much hype.
Yes, I’m always skeptical, and I’m assuming that VR headsets (e.g. Oculus) will take a few iterations – and a few price drops – to catch on. Even a few years in, wearables are still getting mediocre traction; at best, Apple has people wearing them for social status or fashion. Nevertheless, new technology deserves design consideration even more than existing, common devices do. It needs to be nurtured and “done right” in order to have a long life ahead.

What follows are a few things I would keep in mind if I found myself in a position to design for Virtual Reality. Perhaps with more exposure to VR, I can add to this list in the future.


Sharing the room
Others who are not wearing the headset have no insight into the VR experience – unlike a TV, which can be a shared experience. Headsets will either have to become affordable enough that everyone can wear one at the same time, or the solitary device should provide some external feedback to others in the room: an outward-facing display that mirrors a 2D version of the virtual experience, distinct audio signals (for the room, not the wearer), or, as some currently offer, an optional feed that displays on a TV or monitor.

Accessories to support and enhance
Accessories can enhance the experience, further immersing you in the virtual reality by giving you a closer approximation of bodily control. These range from the necessary to the nice-to-have.

The ability to turn in place with ease (without falling into real-world objects) is probably the most important, and can be addressed with a basic swivel chair or a more expensive 360° treadmill.

Oculus accessories

In concert with existing wrist wearables, or custom-made wristbands, the VR headset would no longer need to be the main point of interaction (click, tap, or toggle). Using the accelerometers and Bluetooth already included in any fitness wearable, you could wave an arm in front of you and have the action mirrored in VR. Similarly, a shake or a tap on the wrist could replace tapping a button on the headset to make selections.
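
To make that concrete, here is a minimal sketch of how a wrist tap might be detected and turned into a “select” action, assuming a hypothetical stream of accelerometer samples arriving over Bluetooth; the sample format, threshold, and callback are illustrative, not any particular wearable’s or headset’s API.

```typescript
// Hypothetical accelerometer sample from a Bluetooth wristband (units of g).
// Neither this interface nor the detector below comes from a real device SDK.
interface AccelSample {
  x: number;
  y: number;
  z: number;
  timestampMs: number;
}

// Fires a "select" action when a sharp wrist tap or shake is detected.
// The threshold and debounce values are guesses and would need per-device tuning.
class WristSelectDetector {
  private lastFiredMs = 0;

  constructor(
    private onSelect: () => void,
    private thresholdG = 2.5, // spike magnitude that counts as a deliberate tap
    private debounceMs = 400  // ignore echoes of the same gesture
  ) {}

  handleSample(sample: AccelSample): void {
    const magnitude = Math.sqrt(sample.x ** 2 + sample.y ** 2 + sample.z ** 2);
    const quietPeriodOver = sample.timestampMs - this.lastFiredMs > this.debounceMs;

    if (magnitude > this.thresholdG && quietPeriodOver) {
      this.lastFiredMs = sample.timestampMs;
      this.onSelect(); // stands in for tapping the button on the headset
    }
  }
}

// Usage: feed each sample from the Bluetooth stream into detector.handleSample(...).
```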

Keep things out of frame (move the eye)
The same principle applies here as in photography, painting, or any kind of visual medium: you want the eye to move across the canvas. In this case, you want heads to turn. Short films succeed at this when they offer a rich and beautiful environment but also play between primary and secondary subjects. At times the two are not within the same gaze, and you must turn to see one or the other.

This should be used in moderation, however: too many subjects in different directions can easily tire a VR participant, and you risk a poor experience that leaves observers feeling they missed parts of the story because they were forced to follow one subject while another of equal importance remained out of view.

Sound quality is as important as image quality
A truly immersive experience relies on tricking your senses. A well-crafted story also relies on directed attention. Audio quality aids both by bringing the observer into the virtual world with realistic ambient sound, and the ability to subtly localize a voice helps people grasp whether there’s a character standing next to them that they need to turn and face, or whether the speech is coming from an omnipresent narrator.

Prompt to enable Do Not Disturb when starting VR
This is a short one, but nothing ruins a virtual experience like a pesky notification pushing its way into view. Before starting a VR experience, there should be some reminder or prompt to enable Do Not Disturb mode for the phone. More aggressively, VR software could just disable notifications, but I prefer to let users make the choice.
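
As a sketch of what that flow could look like – remind, let the user decide, then launch – here are hypothetical interfaces standing in for whatever the platform actually exposes for notification settings and dialogs.

```typescript
// Hypothetical platform hooks; not a real SDK.
interface NotificationSettings {
  isDoNotDisturbEnabled(): Promise<boolean>;
  enableDoNotDisturb(): Promise<void>;
}

interface Dialogs {
  confirm(message: string): Promise<boolean>;
}

// Remind rather than force: the user keeps the final choice.
async function startVrSession(
  settings: NotificationSettings,
  dialogs: Dialogs,
  launchExperience: () => void
): Promise<void> {
  if (!(await settings.isDoNotDisturbEnabled())) {
    const enable = await dialogs.confirm(
      "Enable Do Not Disturb so notifications don't interrupt the experience?"
    );
    if (enable) {
      await settings.enableDoNotDisturb();
    }
  }
  launchExperience();
}
```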

Subtitles should remain fixed, detached from video movement
Another specific point is that layered content, like subtitles, should be fixed to an easily legible portion of the screen. In one demo, they were out of view, below the general plane of vision. Although moving around and exploring the setting is a hallmark of VR, some visual elements should be fixed or represented “out” of the virtual space – another plane, or layer, if you will.
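
One way to picture that separate layer in code: a world-locked object keeps fixed world coordinates, while a head-locked subtitle is re-anchored to the current head pose every frame, so it sits in the same part of your field of view no matter where you look. The types and numbers below are purely illustrative, not any particular engine’s API.

```typescript
// Minimal stand-in types; a real engine would carry full orientation, layers, etc.
type Vec3 = [number, number, number];

interface HeadPose {
  position: Vec3;
  forward: Vec3; // unit vector pointing where the viewer is looking
}

// World-locked prop: its coordinates never change when the viewer turns their head.
const worldLockedProp: Vec3 = [0, 1.5, -3];

// Head-locked subtitle: recomputed every frame from the head pose, so it stays
// a fixed distance in front of the eyes and slightly below the centre of gaze.
function subtitlePosition(head: HeadPose): Vec3 {
  const distance = 1.2; // metres in front of the viewer
  const drop = -0.35;   // metres below eye level, still comfortably legible
  return [
    head.position[0] + head.forward[0] * distance,
    head.position[1] + head.forward[1] * distance + drop,
    head.position[2] + head.forward[2] * distance,
  ];
}
```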

Standard
Musings, writing

Crème of Abstraction Layers

Evolution of content publishing online

When I slow down and look at my interactions with most websites, I notice an incongruity in how we accomplish the same end goal: publishing.

While different sites offer different levels of sophistication, I’ve noticed that creating or editing content on the modern web is bubbling up closer to the surface. I liken this to the term as used in Computer Science: abstraction.

At first, you had to write binary that worked directly with the processor. Then they created a language that allowed you to write logic gates which were converted to binary. The higher you get, the more programming becomes natural to humans.

Nick Nelson, Web Developer

As Nick sums it up, there is a pretty deep (and technical) background to programming that few of us think of today — even the programmers. Even though a well-versed developer who works in an object-oriented language might know the logic behind the code, most of us have long since stopped considering logic gates and binary code.

Translating this to web development, the abstraction layers could go all the way down to binary code, but the fundamental difference between software and what we predominantly see on the web seems to start with HTML. Stepping back from HTML toward the human, by my count we are just now seeing mainstream implementation of a fourth layer of abstraction.

facebook

Layer One

HTML and other languages

It used to be that you had to write everything in the language our browsers speak. Yes, we still build websites like this, but you don’t have to know this language to write a blog or update your status. I consider HTML, CSS, and other browser languages to be the first or bottom layer of abstraction on the web. Kids used to learn HTML if they wanted their MySpace to look a certain way. Then WordPress said, “no more!” Enter the second layer of abstraction.

Layer Two

Admin panels and WYSIWYG

WYSIWYG editor tool for the Blogger CMS

WYSIWYG (what you see is what you get) has been around since before the internet, letting you select a different font style or change your margins, colors, and other preferences. Its implementation in Content Management Systems (CMS) brings about the second layer of abstraction. I’m sure you can count plain text input somewhere in earlier CMS platforms, but this is the more common method of creating content on the Web. Blogger might have been the first prolific example (above), but I haven’t spent enough time Googling to tell you for sure.

Aside:
To be true to the definition of abstraction, this comparison should only be made language-to-language. In that sense, languages like SASS, LESS, and the like are another layer of abstraction on top of CSS. I’m using abstraction liberally to talk about the mode of interaction you have with a computer when creating content online, and in that regard, SASS and CSS are in the same bucket of “manually writing out instructions for the browser.”

An important element of this second layer of abstraction is not only the WYSIWYG, but its placement within an administrator’s section of the website. On Blogger, WordPress, or even the relatively modern Tumblr, you must sign in and access a different side of the website to enter new content and publish.

Tumblr’s Content Entry, circa 2013

What makes Tumblr interesting is that the primary experience of viewing and interacting with other posts within the community takes place in the same logged-in state / administrator view.

Other services fall elsewhere on the spectrum between a definitive edit mode and read mode. WordPress, for example, has a completely different experience in the edit mode or administrator side of the site, whereas Flickr was one of the first to blur the line and display the same interface for editing as for reading — with minor differences when clicking on things.

WordPress Content Entry


Flickr Edit in Place Fields. Konigi, 2008

It seems that in the development of a new content platform, there’s a defining choice: whether to embrace the Content Management System (CMS) or to try to hide it as much as possible, creating the illusion that your draft could just as well be live, published content. This design decision is what carries some products from the second layer of abstraction to the third, where creating and reading content begin to merge.

Layer Three

Always logged-in + squishy CMS

Flickr, circa 2013

As you can see above, Flickr has made quite a few changes over the years, and I think it’s an excellent example of the third layer of abstraction in creating content online. Yes, there’s a smaller gap between this layer and the second than there was between the first two, but it’s distinct enough to deserve recognition.

In the Flickr example, people are still interacting with a CMS, and they are logged in as an “administrator” of their content. What is significant, however, is that our online identities have become more solidified, and with a more liberal use of browser cookies we are almost always identified when walking into a website we commonly use. WordPress, for example, despite its many improvements, will still ask you to sign in to access the administrator part of your blog, whereas Facebook, Flickr, Medium, and many others will remember you; what’s more, the main mode of interacting with those sites (communities, really) is within the logged-in state.

As we lean toward an always logged-in state by default, the CMS has necessarily merged with the published content. Even when interacting with the CMS, it has become standard to compose or edit in the same place you view other content. When posting a Facebook status, your browser doesn’t ask you to leave the news feed. When publishing a tweet, you no longer need to refresh the page to see it. Overall, there is a higher sophistication of front-end web development at work that makes these CMS interactions quite “squishy” compared with the very distinct moments you have with, say, a WordPress CMS versus reading the blog.
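
A rough sketch of the front-end trick behind that squishiness: the post goes out in the background, and the result is spliced into the page the author is already reading, with no reload. The /api/posts endpoint, payload shape, and #feed element are all made up for illustration.

```typescript
// Publish a new post and splice it into the feed without leaving the page.
// Endpoint, payload, and DOM ids are hypothetical.
async function publishInPlace(body: string): Promise<void> {
  const response = await fetch("/api/posts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ body }),
  });
  const post: { id: string; body: string } = await response.json();

  // Render the new post at the top of the feed the author is already viewing.
  const feed = document.querySelector("#feed");
  const item = document.createElement("article");
  item.id = `post-${post.id}`;
  item.textContent = post.body;
  feed?.prepend(item);
}
```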

Sitting here in 2015, this doesn’t sound like much of a revelation. My apologies if I didn’t warn you ahead of time, but I don’t see myself as a visionary. I just think it’s important to document what we see.

Layer Four

Collaborative content

If I am to follow this winding definition of creating content and getting further away from complying with computers to get things done, then the last layer as I see it must be collaborative documents. Hear me out:

  1. Writing code – You’re using HTML, SASS, anything that’s meant for a browser and not a human.
  2. WYSIWYG / Hard CMS – You’re filling in text boxes, clicking formatting buttons, previewing, publishing, and then going somewhere else to see how it looks.
  3. Always logged-in / Squishy CMS – You don’t have to go anywhere else when finished creating or editing content. The line between browsing the web and composing has been blurred, but there is still a very strong line between you and your readers: the save/post/publish button.
  4. Collaborative content – In this state, the CMS is the viewing platform and composing platform at the same time. There’s no line between browsing and composing, nor is there a line between you and your readers:
Google Docs

I think where this notion of the Fourth Layer feels a bit forced is that it’s not a typical use case. Collaborative tools such as Google Docs, Dropbox, Box, and the like are associated with professional use only, and even in a professional setting they are not the norm.

What’s interesting to me however, is a hypothetical type of social media where that line between author and reader is selectively removed. Let’s take Facebook as an easy target. Imagine if you didn’t have to click the Post button on a status.

Oh wow Rob, that would make my life 9,000 times easier!

Yeah, I thought as much. It might even do more harm than good.

Side-note: isn’t there some publication that uses data Facebook has on what people draft as a status update versus what people actually publish?

I think it would be fun in some safe spaces, such as a curated group of your best friends, where you could post content live and anyone who happens to be on the page at the same time could be drawn into your activity and instantly (or simultaneously) begin to respond to, react to, or build on what you’re putting out there.

Facebook

Okay, so let’s back away from the Facebook example a bit. I’m getting very specific just to explore what using the web would feel like if we managed to abstract ourselves a bit more from the already ‘squishy’ CMS. Perhaps there will always be a line – a minimal confirmation moment when an author acknowledges whether something will be born onto the internet or not.

I think this depends on how people conceive of the web. If you imagine it’s more akin to a book or newsletter than to a dinner conversation or phone call, then yes, there will always be an interaction with the machine, no matter how minimal. If you’re of the latter opinion, however, then maybe at some point all lines will dissolve and we’ll interact with the web as we do in person – maybe that leads to more explaining and less editing, but that’s a whole other can of worms.


PS

I didn’t want to get too technical while exploring these different ways of creating content online, but I’d like to acknowledge that these layers of abstraction do not imply that we’re detaching from machines, markup languages, or programming of any sort. If anything, a greater layer of abstraction requires more sophisticated code to support such an elegant interface on the outside.

Good design should not aspire to render a complicated system as a seamless one; on the contrary, I hope we continue to focus our attention on the seams and learn how best to mold them to fit our needs.*

We used to log in to create blog posts and that was a necessity for security and identifying the author. Now we are logged in everywhere, for social reasons, for our own sense of digital identity.

Some Links:
http://en.wikipedia.org/wiki/Abstraction_(computer_science)
http://en.wikipedia.org/wiki/Abstraction_layer
http://en.wikipedia.org/wiki/WYSIWYG
http://konigi.com/interface/flickr-edit-place-fields
* I’d love to take credit for such an intelligent-sounding stance on design, but I first read about it here: Matthew Chalmers (2003)
Standard
Work, writing

Problem solving

There’s some discussion around the office, mostly among Interaction Designers, about the “Invisible User Interface.” Here’s an excerpt from The Best Interface is No Interface by Golden Krishna on The Verge. I readily agreed with almost everything he says… until reading the article that I share below.

As a criticism of our obsession with apps and interfaces (I’m certainly guilty), his point of view is refreshing. It strikes at something that should be discussed. Golden Krishna identifies a symptom of lazy design and, dare I say, of kowtowing to less-than-savvy clients who are prepared to give you $1M to design an app.

An honest scenario

Some institution or company comes to a design agency with a problem.
Usually it boils down to something basic: we need more people to sign up for our service, we want people to use our service more, or the classic, we want people to buy our things instead of our competitors’.

The design agency has been designing apps and websites for years. The fact that a business even approaches a design agency implies that the business owner or otherwise important stakeholder has a solution in mind: an app, a website, an interface.

The design agency will “take a step back” and carefully rephrase the business problem to their client. They’ll brainstorm and consider many solutions. At the end of the day, the unspoken understanding is that the design agency knows how to make apps and websites, and the business person came to the agency because that’s what they want.

Long story short, both parties end up jumping to the conclusion that an interface is the solution to the problem.

Slow down.

Designers are problem solvers.
You might have a title like visual designer, graphic designer, experience designer, interface designer, interaction designer… and that first word in your title pushes you to keep making the sort of things you always make. My greatest personal and professional challenge is to acknowledge the second word of these (often silly) titles. Living up to being a Designer means considering everything, and not jumping to the familiar toolbox to fix or improve something.

Side-note: This is why I was so enamored of the Service Design approach that Fjord champions. Unfortunately, it’s less tangible and must be difficult to sell, because this type of thinking still makes up a minority of their portfolio.

The point.

I meant to just drop a link in here and sprinkle in a pull-quote from an article that I liked. I’m eager to explore where I really stand between the ideas of the Invisible Interface and seamful experiences, but I’m still quite fresh on the topic. For now, here’s the link I came here to share:

No to NoUI – Timo Arnall

Standard
Reblog

Most interesting (recent) read

I haven’t posted here much, and that’s basically because I’m lazy. I’m still here, though!

Quickly now, I’d like to share an article that – in my opinion – has a lot of meat, and all of it is interesting, if you’re a designer.

Chinese Mobile App UI Trends

By Dan Grover

Some highlights include:
Chinese culture doesn’t make a big deal of meeting strangers nearby through social apps.
CAPTCHA utilized on login screens (not just signup flows).
People really do use QR codes!
Moments – Just scroll to this part. I really dig the philosophy.

Standard
Musings

Facebook Paper’s Pinhole Browsing

Read on Medium

Introduction

The following comes from my response to a company email (below) that asked designers for their opinion of FB Paper. I have not spent enough time with it to critique every aspect of the app. There are things I’d like to say about post creation, browsing profiles, and new-user onboarding, for example, but I simply didn’t take the time to go there. This is a hot topic for the moment, and I might continue to write and think about it, but there are a lot of intelligent people sharing diverse opinions out there. I think this can suffice for me.

The Prompt

TO ALL:
I used Paper for about 30 minutes tonight and felt there were some interesting interactions but overall was a bit frustrated. Definitely felt the “hook pain” the author here writes of… What do you guys think?

My Response

The design podcast, On The Grid, talked about their first impressions of Paper in this episode.

Opinions seem pretty polarized. A lot of people love it because it is admittedly much more polished than the regular, functional Facebook app. I think if they had replaced their primary app with this one, there would be an uproar; but as a supplemental experience, people seem to like this new lens on the news feed (and other stuff?).

I’m not so kind in my opinion. I had high hopes of Facebook waning enough to go down a slippery slope of MySpacey death, but this app seems to appeal to the masses initially. As mentioned in the podcast and other sources, Facebook’s intent was to slow down our consumption of content with the hope that we pay more attention to each post. I believe these are smart guys, and they undoubtedly understand their users better than I do, but I just can’t fathom slowing down Facebook without also reducing the amount of content.

Matas hopes that you’ll flip through slowly. “You really want people to spend a little bit of time with it and appreciate that content,” Matas says, “almost like when you go to a museum and you spend a little bit of time with each thing.” The Verge

I think that’s easier said than done. Because it shows only two whole thumbnails at a time, I am instead frustrated by the pinhole scope of content that I can browse. Perhaps they’re designing for the future, a vision of Facebook with much more interesting content, but the present feed I get from my connections is a 95:5 crap-to-interesting ratio. Slowing down and smelling the digital roses is not what I want from Facebook. I want to quickly skim frequently updated, vast amounts of content until I land on an interesting picture, link, or juicy argument to read. Much like trying Windows Phone for a week, I felt like my hands were tied. I wanted to zoom out.

This approach might work best for the other editorial sections they’ve vaguely collected (‘tech, culture, cute, etc.’), where content and curation are better. My complaint on this front is the lack of context. Looking at the Tech tab, for example, what should I expect to find? Articles from TheNextWeb, NYTimes Tech, Engadget, A List Apart, The Verge? There’s quite a difference in the quality of writing and topics covered between sources. I don’t know how each of these channels is curated, or how or whether they’re connected to my account, and I have no ability to customize them. At least with Flipboard, you can customize your content sources (note: I don’t use that app either).

The one thing I did like is the vertical swipe navigation. It’s nice that I don’t have to stretch my thumb across the screen to hit a specific region that has the link or CTA. I can hold my phone in one hand, and a broad gesture moves me up or down the hierarchy of channel, thumbnails, article preview, and article detail. I agree the horizontal motion would get tiring. Maybe if they moved that row of thumbnails to the top of the screen it would be more comfortable, but I don’t like this browsing mode in the first place.

— whew —

Sorry for all the negativity.

I prefer ease of use, and I am quickly skeptical of idealistic presentations.

This app hinders my personal browsing preference of Facebook content, and does not seem to achieve the ideal they put forth. It might be a nice RSS reader, but then it would just be yet another RSS reader.

Note: This is also a Medium post

Standard
Musings

Auto-punctuation and syntactic processing

Once in a while, when I want to construct a clever piece of SMS text-art, I can be annoyed by the automatic placement of a period (full stop) after tapping the space bar twice. We must design for the most common use cases, however, and on the whole I do like this auto-fill logic. The little things are often what delight users the most.

Being too liberal with a well-intended shortcut or assumption can be a UX calamity, but I believe that with some additional linguistic information (and a fair amount of user testing), the double-tap auto-punctuation could be extended to questions with reasonable success.

Present functionality

Let’s keep in mind that at present, this feature applies a period any time the space button is hit twice in a row. It could interrupt a sentence, or easily mislabel a question or exclamation. It’s currently up to the user to add alternative punctuation before continuing, or to erase the auto-punctuation and correct it.

Proposed functionality

Imagine double-tapping the space bar and having iOS predictively produce proper punctuation — I couldn’t resist the alliteration. This would be informed by linguistic programming about content (question words) and context (companion words that differentiate a statement from a question).

Do you know how to do this? versus You know how to do this.
Reason: the presence of the auxiliary “do” in “Do you know…”. Note that the statement also contains the word “do,” but only as part of the infinitive (“to do”).

English can be tricky with its question semantics, and this solution would require localized code for different input languages, unlike the global period insertion.
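
As a very rough sketch of the English heuristic (the word lists and first-word check are invented for illustration; a shipping keyboard would need real syntactic analysis):

```typescript
// Decide which punctuation a double space should insert, based on how the
// sentence starts. Illustrative only: real input would need proper parsing.
const AUXILIARIES = [
  "do", "does", "did", "is", "are", "was", "were",
  "can", "could", "will", "would", "should", "have", "has",
];
const WH_WORDS = ["who", "what", "when", "where", "why", "how", "which"];

function autoPunctuation(sentence: string): "." | "?" {
  const words = sentence.trim().toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return ".";

  // "Do you know how to do this" -> "?", "You know how to do this" -> "."
  const first = words[0];
  return AUXILIARIES.includes(first) || WH_WORDS.includes(first) ? "?" : ".";
}
```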

Limitations, Advantages

For English usage, the limitations are easily apparent. It may require too much precious processing to scan an entire string of text for this kind of semantic context. For some languages it might be pointless if opening punctuation is used – such as the Spanish ¿…?. It would be interesting to survey Spanish users to see how prevalent this convention is in SMS and other mobile communication. My hunch is that, like many English vowels in texting shorthand, these opening question marks are usually dropped for brevity.

Some languages lend themselves to very “easy” semantic processing – so much so that one might ask, “why aren’t we doing this automatically?” To reference Spanish again (the only other language I feel qualified to write about), interrogative words are supposed to be spelled differently in the context of a question.

What/how is (she/he/it) like?
¿Cómo es?
What/how (she/he/it) is like.
Como es.
Interrogative words (what, where, etc.) carry a written accent in questions but not in statements.
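
The Spanish shortcut is even easier to sketch, since the accented interrogatives only appear in questions. The word list and the crude substring check below are invented for illustration, not a real morphological analysis.

```typescript
// Accented interrogatives are a strong signal that the sentence is a question.
const INTERROGATIVES_ES = [
  "cómo", "qué", "quién", "quiénes", "dónde", "adónde",
  "cuándo", "cuál", "cuáles", "cuánto", "cuánta", "cuántos", "cuántas",
];

function autoPunctuationEs(sentence: string): "." | "?" {
  const text = sentence.trim().toLowerCase();
  return INTERROGATIVES_ES.some((w) => text.includes(w)) ? "?" : ".";
}

// autoPunctuationEs("Cómo es") -> "?"  (¿Cómo es?)
// autoPunctuationEs("Como es") -> "."  (Como es.)
```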

This is just a potential shortcut, and there are many question sentences without one of the “five Ws,” but it’s an example of logic that can circumvent contextual rules. Some languages even have dedicated question particles (Japanese and Mandarin come to mind), though they also tend not to use question marks, as far as I know.

This is just a few minutes of brainstorming, but if done properly, it could be pretty useful, couldn’t it.

What do you think.
See what I did there.

Standard