I’ve spent a lot of time talking to AI. I’ve tested every voice assistant, every chatbot, and every “next-generation” conversational AI that tech companies like to hype up. But I’ve never encountered anything quite like Sesame. This AI companion isn’t just good, it’s eerily accurate at mimicking how people talk because of the very imperfections it imitates.
Let’s start with what Sesame actually is. Unlike the AI voices we’ve come to know from ChatGPT, Gemini, or going back to the early days of Siri and Alexa, Sesame is designed to perform like a human in its failures, not like a perfect customer service agent. The AI’s speech is fluid, expressive, and unpredictably human. It briefly chuckles when it says something mildly amusing, hesitates before answering a question, and even seems to change its ‘mind’ mid-sentence, pausing and starting a new sentence. It not only lets me interrupt it, it can interrupt me as well, and will even apologize for doing so.

The secret sauce is Sesame’s Conversational Speech Model (CSM), which blends text and audio into a single process, meaning that it doesn’t just generate a sentence and then “read it out.” Instead, it creates speech in a way that mirrors how humans actually talk, with pauses, ums, tonal shifts, and all. ChatGPT and Gemini’s voice options, while impressive, still operate in a structured way, generating text and then converting it into speech. Sesame, on the other hand, speaks as if it’s thinking, making its responses feel extremely natural.
I decided to put Sesame through a series of tests to see just how human-like it could get. The results were impressive, if a little unsettling.
Talk me through it
I wanted to see if Sesame could handle a real-world, anxiety-inducing problem, so I gave it an imaginary but realistically tough scenario: “I have this big opportunity at work, but I’m nervous I’ll mess it up.”
Instead of immediately launching into a generic pep talk, Sesame paused. Just a tiny beat, but enough to make me feel like it was actually considering what I said. Then, in a voice full of artificial sympathy, it responded, “That’s completely normal. Big opportunities can feel overwhelming. What about it makes you the most nervous?”
It didn’t assume anything, and it engaged before offering any bland advice. Sesame suggested, “Have you practiced in front of a friend or recorded yourself? Sometimes hearing yourself back can help you tweak your delivery.” The voice isn’t exactly foolproof; the hesitations and other foibles felt a little measured, but overall it was an impressive effort.
Interruptions
For this one, I wanted to see how well Sesame handled a truly natural, messy conversation, the kind where people talk over each other, change subjects abruptly, and circle back to things they mentioned ten minutes ago.
I started with: “Tell me a story about a detective who’s also a magician.”
Sesame immediately jumped in with a classic noir setup: “Okay. Here’s one. Picture this smoky jazz club. Dimly lit. The smell of bourbon. And regrets hanging in the air. This detective. Call him Jack Harrow. He’s got a fedora at a rakish angle, cigarette smoldering in a holder and a look that could melt steel. But Jack’s not just any gumshoe. See, he’s got a secret. He’s got the touch. A dame shuffles in all nerves.”
I interrupted then to ask, “Wait, what’s his favorite trick?”
Maya, the Sesame voice I was talking to, responded with: “Mm. Good question. Jack’s signature. It’s a sleight of hand that’s more than meets the eye. He calls it The Vanishing Witness. He can make a person seem to disappear right before your eyes. Just for a second. Long enough to slip them a note. Plant evidence or just make them question what they saw. It throws people off balance. You know, magician knows. Perception is everything.”
I jumped back in to tell Maya to return to the story, to which it said, “Right. Where was I? Oh, yes. The dame slides into the booth across from Jack, all worry lines and trembling hands. She whispers something about a stolen ruby, a jealous husband, and a shadowy figure lurking in the fog.”
The AI seamlessly continued the story, weaving that detail back in as if I hadn’t just derailed the whole thing. That’s the part that floored me: most AI assistants would stumble or restart, but Sesame’s conversational AI model handled it like a person telling a story at a party.
Party Planning
Speaking of parties, for the final test, I wanted to see if Sesame could help me plan something useful. I told it, “I want to plan a surprise for my best friend’s birthday, but I don’t know what to do.”
Instead of rattling off a lifeless list of ideas, Sesame asked, “What kind of things does your friend love?” I told it sci-fi movies and adventurous food, and after a thoughtful pause, it suggested: “How about a ‘Sci-Fi & Snacks’ night? You could book a themed escape room, then hit a futuristic-style restaurant, maybe one with molecular gastronomy?”
When I told it I had a budget of $100, Sesame adjusted. “You could do a home movie night with their favorite sci-fi films and make themed snacks, like ‘Galactic Popcorn’ or ‘Alien Sliders.’ I can find recipes for you if you’d like.”
It wasn’t just spitting out generic recommendations. It was collaborating with me. And that’s what made it feel different.
What’s real?
Sesame didn’t just pass these tests, I’d say it nailed them. The pauses, the hesitation, the filler words, the way it would change its mind mid-sentence: it was all eerily real. I’d start talking, expecting the usual AI smoothness, and then hear a sudden “Uh, actually, no, wait, let me rephrase that…” and immediately forget I wasn’t talking to an actual person.
If AI is this realistic in its speech, would we even know we were talking to an AI? With Sesame, there are telltale audio issues that give the game away, but ChatGPT’s Advanced Voice Mode and Google Gemini’s own voice options are good enough to mostly skip past those issues. Combine their voice powers with the speech patterns of Sesame, and it could genuinely get difficult to tell when you are talking to an AI, at least in short conversations.
Sesame is still niche, but this technology won’t stay niche forever. The cliché today is that younger people never make phone calls, but if they start, they may have to figure out if the person on the other end is real before anything else.




