Episodes · S2 E41
The Accidental Algorithm | Humans of AI Crossover with Writer's Melisa Russak
Key takeaways
- Melisa Russak built what turned out to be K-means clustering without knowing the algorithm existed. As a high-school math teacher in China, she wrote a classification system to recognize her students’ handwritten Chinese characters — only later discovering, in her words, that “people are working already more than one hundred years on this problem.”
- Russak says she chose to study Chinese alongside mathematics precisely because it was “maximally different” — “a completely different system” with no alphabet that pushed her “completely out of your comfort zone.” She wanted learning she could see compound day by day, unlike pure math where “you can spend ten hours on thinking about a math problem and have no results.”
- Russak rediscovered embeddings the same accidental way. Trying to build software that tracked her internet activity to answer “who am I?,” she had to represent both text and images so a computer could compare them. Her team “spent a lot of time trying to design those features ourselves” — before, as she puts it years later, “it’s all about embeddings… you just need to learn good embeddings.”
- Her advice to researchers: frame a problem yourself before reading the papers. “Once you start reading papers, you will converge to… how they framed this problem,” Russak warns, “and it’s very difficult to escape from that box once you are in the box.” Her accidental discoveries happened because she didn’t yet know the “right” way to think.
- At Writer, Russak frames the no-customer-data policy as a feature, not a wall. “We don’t use customer data. So how do I develop a system without data?” The team’s answer was to train a model to generate the data itself: “I would never come up with this if not given those constraints.”
- Russak puts the “vibes test” above benchmarks. A benchmark, she argues, “is like a cherry picked use case”; the missing piece is going to a human and asking whether they actually like using the model in production. “We always say that the VIBES test is the most important. After you satisfy all of those benchmarks, you go and do the VIBES check.”
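For readers who have not met the algorithm named in the first takeaway, here is a minimal sketch of K-means clustering in plain Python. It is not Russak's classroom system, which the episode does not describe in detail. The function names `kmeans` and `assign`, the fixed iteration cap, and the toy 2-D points are all assumptions made for illustration; a real handwriting classifier would cluster feature vectors extracted from character images.

```python
import random
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means (Lloyd's algorithm) over a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = assign(points, centroids)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    # Hypothetical toy data: two small, well-separated groups of 2-D points.
    toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(toy, k=2)
    print(centers, labels)
```

The two alternating steps, assign then update, are the whole algorithm. The "improvements" Russak mentions finding later would sit on top of this loop, for example smarter initialization than random sampling.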
Frequently asked questions
- Who is the guest, and is this a Chain of Thought interview?
- The guest is Melisa Russak, a lead research scientist at Writer. This is a crossover episode — Chain of Thought is re-sharing an episode of Writer’s own podcast, Humans of AI. Conor Bronsdon records only a short intro and outro framing the swap; he does not conduct the interview. The Humans of AI host, Elora Weaver, narrates Russak’s story, so the analytical and connective passages between Russak’s own words are Weaver’s, not the guest’s.
- How did Melisa Russak “accidentally” invent machine learning algorithms?
- Russak describes doing it twice, without knowing the field existed. First, as a math teacher in China, she built a system to classify her students’ handwritten Chinese characters — which turned out to be K-means clustering. Later, building software to track her own internet activity and learn something about herself, she had to represent and cluster text and images, reinventing the approach now known as embeddings. In her telling, she only discovered afterward that researchers had long been working on the same problems.
- What does Russak mean by the “vibes test”?
- Russak argues that benchmarks are essentially cherry-picked use cases, so passing them isn’t enough. The missing piece, she says, is time-consuming and human: you give the model to a real person, have them use it in production, and ask whether they actually like talking to it. “We always say that the VIBES test is the most important,” she says — after a model satisfies the benchmarks, the team does the vibes check.
- Why does Russak see Writer’s no-customer-data rule as an advantage?
- Russak says Writer made a deliberate choice not to use customer data, which at first felt like an impossible constraint: “How do I develop a system without data?” Her team’s answer was to train a model to generate the data — “I would never come up with this if not given those constraints.” She treats the constraint as an “excellent constraint” rather than a limitation: something that forced an idea she would not have reached otherwise.
- What is Russak’s warning about building on third-party model APIs?
- Russak cautions that a startup relying on an external API is exposed if the provider changes the underlying model: “Your entire business is in pieces,” she says, “because everything that you created so far on top of it, it stops existing.” Writer’s counter-approach, as she describes it, is to understand its own models deeply through extensive testing — including how each one behaves under quantization and distillation — so the team knows how to design a system on top of the model rather than optimizing for benchmark scores alone.
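The representation problem that comes up in the answers above, turning content into vectors a computer can compare, can be sketched with a hand-designed feature. This is only an illustration of the general idea, not the system Russak's team built: `featurize` uses simple word counts as a stand-in for the features the team designed by hand, and both function names are assumptions. A learned embedding would replace `featurize` with the output of a trained encoder model, while the comparison step would look much the same.

```python
from collections import Counter
from math import sqrt


def featurize(text):
    """Hand-designed feature vector: lowercase word counts (not a learned embedding)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical snippets standing in for pages from a browsing history.
    first = featurize("learning chinese characters by handwriting")
    second = featurize("handwriting practice for chinese characters")
    third = featurize("train schedule for tomorrow morning")
    print(cosine_similarity(first, second), cosine_similarity(first, third))
```

Word counts only capture exact word overlap, which is the limitation that learned embeddings address: an encoder can place texts, and images, with similar meaning near each other even when they share no surface features.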
Show notes
This week, we're doing something special and sharing an episode from another podcast we love: The Humans of AI by our friends at Writer. We're huge fans of their work, and you might remember Writer's CEO, May Habib, from the inaugural episode of our own show.
From The Humans of AI:
Learn how Melisa Russak, lead research scientist at WRITER, stumbled upon fundamental machine learning algorithms, completely unaware of existing research — twice. Her story reveals the power of approaching problems with fresh eyes and the innovative breakthroughs that can occur when constraints become catalysts for creativity.
Melisa explores the intersection of curiosity-driven research, accidental discovery, and systematic innovation, offering valuable insights into how WRITER is pushing the boundaries of enterprise AI. Tune in to learn how her journey from a math teacher in China to a pioneer in AI research illuminates the future of technological advancement.
Follow the hosts
Follow Atin
Follow Conor
Follow Vikram
Follow Yash
Follow Today's Guest(s)
Check out Writer’s YouTube channel to watch the full interviews.
Learn more about WRITER at writer.com.
Follow Melisa on LinkedIn
Follow May on LinkedIn
Check out Galileo
Try Galileo
Transcript
Speaker 0:00 Hey, everyone. Chain of Thought host, Conor Bronsdon here. We are switching things up a bit this week and showcasing an incredible episode from another podcast that we think you'll love, Humans of AI podcast, which is produced by our friends over at Writer. As some of you may know, we're big fans of the work they're doing. We actually had their CEO and co-founder, May Habib, on our inaugural episode of Chain of Thought last year, and we definitely encourage you to dive into our back catalog to check out that episode as well. The episode we're sharing today is a fascinating conversation with one of Writer's AI research scientists, Melisa Russak. Her story is a powerful reminder that sometimes the biggest breakthroughs come from the perfect combination of curiosity
Speaker 0:41 and naivete because as Melisa discovered, sometimes the best way to innovate is to have no idea what you're doing. It's a thought-provoking lesson, and if you enjoy it, we highly recommend adding Humans of AI to your podcast feed. We'll be back next week with our regularly scheduled programming as we're joined by Ashwarya Srinivasan. But for now, I'll let the team over at Humans of AI take it away.
Speaker 1:10 Of course, at this point, you have no notion of AI, right? You never came across machine learning, this phrase. You see text, right? Text is simple. I mean, like, relatively simple, but you also see pictures, right? So this is kind of what we were trying to achieve. That's Melisa Russak and what she's describing, trying to figure out how to make computers understand pictures and words,
Speaker 1:43 well, that's machine learning. Except she didn't know that yet. She was just a math teacher in China tinkering with a problem that fascinated her, completely unaware that she was reinventing algorithms that had been studied for decades. Which raises a question that's been nagging at me since I first heard Melisa's story. How many times do we independently arrive at the same ideas?
Speaker 2:11 How often do we think we're being original when we're actually walking a path that's already been carved? I'm Elora Weaver. Welcome to Humans of AI. Today, we're telling the story of a woman who accidentally became a machine learning pioneer twice and what that tells us about the nature of discovery itself. Our story begins not in Silicon Valley or MIT, but in Chengdu,
Speaker 2:42 the bustling heart of Sichuan province, where ancient temples cast shadows on glass towers, and where a young woman named Melisa was studying mathematics, the very language that powers artificial intelligence. But here's the thing, she was restless. Mathematics is purely abstract. So when I studied mathematics, I was discovering that a part of me is not developing.
Speaker 3:08 Like in mathematics, I have a feeling like there is this thing like you can spend ten hours on thinking about a math problem and have no results. You can go to Chinese and you can spend those ten hours learning Chinese sounds, right? So you're able to see the progress. What she was describing was the mathematician's dilemma, working in a realm of pure abstraction
Speaker 3:28 where breakthroughs can take years, where you might spend decades on a problem that leads nowhere. She craved something more tangible, more immediate. She wanted to see her learning compound visibly day by day. So, she made what might seem like an irrational choice. She decided to study Chinese alongside math, not because it made practical sense, but because it felt like the missing piece.
Speaker 3:58 So I just chose something maximally different from mathematics and that was Chinese. I need to admit that Chinese because it's challenging, right? It's a completely different system, like everything you know from phonetic system. So you don't have an alphabet, right? Completely out of your comfort zone. What Melisa didn't realize then was that she was training herself in something that would become essential to her future work,
Speaker 4:21 pattern recognition across completely different systems. The human brain that can switch between abstract mathematical proofs and tonal Chinese characters, that's exactly the kind of brain that can see patterns in data that others might miss. After graduation, she became a high school math teacher, and it was there, watching her students struggle with handwritten Chinese characters, that something clicked. I had a hobby, like this is the time when I discovered programming.
Speaker 4:52 So I started from ActionScript and Flash because that was how can I facilitate my students to learn faster, to learn better, to memorize better? She wanted to build something that could recognize these handwritten characters automatically. A classification system, she called it. Here's the thing that gives me chills about this story. Melisa was essentially trying to solve the same problem that researchers at universities
Speaker 5:16 around the world were tackling with sophisticated machine learning algorithms. Except she had no idea that's what she was doing. Of course, at this point you have no notion of AI, right? You never came across machine learning, this phrase? She was like someone trying to reinvent the wheel, not knowing that wheels exist. Except in her case, she actually succeeded.
Speaker 5:42 So this is when it all started. I think that was my first data science project, my first data science job that really pushed me into discovery. Only later did she discover that what she had built was essentially K-means clustering, a fundamental machine learning algorithm that had been around for decades. Of course, you come up with the algorithm that is not perfect, then you discover that actually people are working already more than one hundred years on this problem and there are improvements to this. So it's like, you upgrade.
Speaker 6:12 So that was amazing. She didn't just stumble onto one established algorithm. She did it again. Melisa's second accidental invention happened when she started wondering about memory and self-knowledge. She wanted to build software that could track her internet activity and tell her something about herself that she didn't already know. Maybe you had this problem like when you wake up in the morning and you have this empty blank page, right? Like you're asking questions like what I wanted to do today, right? Even before the first coffee, what I wanted to achieve today, who am I?
Speaker 6:58 To solve this, she needed to figure out how to represent both text and images in a way that a computer could understand and compare. She needed to group similar content together to find patterns. And of course, if you try to do that, you naturally come into machine learning. This is exactly machine learning. Because you see text, right? Text is simple. I mean, like relatively simple, but you also see pictures, right? So how do you represent pictures?
Speaker 7:24 What she was describing, learning good representations of data, is what we now call embeddings. And then clustering that data to find patterns, well, that's the foundation of modern AI systems. That's what powers everything from recommendation algorithms to large language models. Right now, of course, like after those years in machine learning, I would say like, it's all about embeddings, right? You just need to learn good embeddings.
Speaker 7:51 So you need to have a good encoder model. But back then, she was just a person with a curious question about self-knowledge, working with a team of linguists, trying to build something that had never been built before. We actually spent a lot of time trying to design those features ourselves, those embeddings, and trying to come up with the first system. She was reinventing
Speaker 8:12 neural networks, embeddings, clustering algorithms, the entire foundation of modern AI, because she needed them to answer a simple question. Who am I? There's something happening in Melisa's story that goes beyond just the coincidence of rediscovering algorithms. She stumbled onto something fundamental about how discovery works. Actually, this is a piece of advice that I give.
Speaker 8:49 Like, if you have a topic, before you start working on a topic, think about how you would frame it yourself. Because once you start reading papers, you will converge to what they actually how they framed this problem. And it's very difficult to escape from that box once you are in the box. This is the paradox of knowledge: The more we know about how others have solved a problem,
Speaker 9:11 the harder it becomes to see new solutions. Melisa's accidental discoveries happened precisely because she didn't know the right way to think about these problems. But there's something even deeper here. Melisa's current project, the one she's working on now, takes this question of accidental discovery into even more philosophical territory. I do have an even wider idea.
Speaker 9:39 So imagine that you collect all artifacts that you see, all of pictures, like even sound, everything that you can collect, even conversation with other human beings. So this is your input, right, as a human being. And then try to train an LLM on this input and try to ask the LLM, what's my next action? She wants to train an AI on everything she experiences, every conversation,
Speaker 10:03 every image, every piece of text, and then see if it can predict what she'll do next. I would love to train an LLM on the entire input. And then I would love to check to what extent I'm random, to what extent it can predict what I want to do next. Like checking, do I have free will? Think about what she's proposing here. If an AI trained on all of your experiences can predict your next action,
Speaker 10:30 what does that say about free will? Are we just very sophisticated algorithms ourselves following patterns we're not even aware of? Usually how it's framed is you have free will if your next action is not fully determined by your history, like the entire input that you get. It's the ultimate version of her original question, who am I? Taken to its logical extreme
Speaker 10:56 and it connects directly back to her accidental discoveries. If our thoughts and innovations are just the inevitable result of our inputs and experiences, then maybe Melisa's accidental algorithms weren't accidents at all. Maybe they were the only possible outcome of her particular combination of mathematical training, linguistic curiosity, and teaching experience.
Speaker 11:30 Building enterprise-grade AI shouldn't be complicated. It should just work the right way. At writer.com, we don't do everything. We do one thing. We build enterprise AI that unites business and IT. Business teams? Build your own AI agents. No code required. IT teams? Manage just one platform, not a plethora of point solutions. Writer.com creates AI tools that are safe, scalable,
Speaker 11:57 and smarter every time you use them. That's why Accenture, Qualcomm, Vanguard, and hundreds more aren't just doing enterprise AI the right way. They're doing it the Writer way. Book your demo at writer.com/demo. At Writer, where Melisa now works as a research scientist, she's discovered something remarkable. The same conditions that led to her accidental breakthroughs in China are happening again,
Speaker 12:33 but this time it's by design. Excellent thing about this place is constraints. At the very beginning, in Writer, what we emphasize, we don't use customer data. So how do I develop a system without data? And at first you can be angry, right? I can't create a solution. And then you think about this, Oh, this is an excellent constraint. Right? Most AI companies today are built on hoarding vast amounts of user data.
Speaker 13:01 Writer took the opposite approach. No customer data, period. To most people in the industry, this would seem like trying to build a car without wheels. But for someone like Melisa, who had already reinvented algorithms from scratch twice, this wasn't a limitation. It was an invitation to innovate. So actually the first model that we created was to generate the data.
Speaker 13:27 So we trained a model to generate the data. And I think that was an amazing idea because I would never come up with this if not given those constraints. She's describing something that would become a cornerstone of modern AI development, synthetic data generation. But this was years before it became mainstream, before everyone was talking about it. Once again, necessity had led her to reinvent the cutting edge.
Speaker 13:52 And there's something else happening at Writer that's different from the rest of the industry. What's interesting about Writer is nobody told me how to do that. They told me you will find a way. Just take your time. Look around, you will find a way. This isn't how most tech companies operate. Usually there are roadmaps, established methodologies, best practices copied from other companies.
Speaker 14:17 But Writer was betting on something else, that the same curiosity-driven approach that led to Melisa's accidental discoveries could be channeled into systematic innovation. When I joined, we had two people or three people in NLP team. Right now we have several teams doing NLP. And I remember my first task was training a model for grammar. It was not obvious that an LLM can actually do the job well.
Speaker 14:40 I consider T5 a large model. Just to give you the scale right now, that model was less than 1,000,000,000 parameters. What she is describing is the difference between research and engineering. Most companies in 2020 were focused on implementing existing solutions at scale. Writer was asking deeper questions. What if we could build enterprise AI systems that actually understand how businesses communicate?
Speaker 15:09 What if we could make AI that doesn't just follow patterns, but understands context and intent? Everyone in the industry is relying on benchmarks. That's true. But the benchmark is like a cherry picked use case. There is one piece that is missing in all of that testing. And it's very time consuming because it's going to a human being, asking the human being, could you please use that model in production? Do you like actually talking to the model? She's talking about something they call the vibes test.
Speaker 15:40 And it reveals something profound about how Writer approaches AI development. So we always say that the VIBES test is the most important. After you satisfy all of those benchmarks, you go and do the VIBES check. While other companies chase benchmark scores, Writer is asking, does this AI actually work for real people doing real work? It's the difference between optimizing for test scores and optimizing for human experience.
Speaker 16:09 This approach has led to breakthroughs that go far beyond what typical enterprise AI can do. They're not just building chatbots or document generators. They're building AI that understands the nuances of how different industries communicate, that can adapt to company specific writing styles, that can actually improve how people think and work with language. Sky is the limit, right? Even with the current technology, we can go very, very far.
Speaker 16:36 So it's only about imagination. This isn't just one person's success story. Writer has created an environment where this kind of innovative thinking can flourish, where constraints become catalysts, where the impossible becomes just another interesting problem to solve. There's a larger pattern here that goes beyond Melisa's individual story. Throughout history, some of our greatest discoveries have come not from experts following established paths,
Speaker 17:12 but from outsiders approaching problems with fresh eyes. From Kepler discovering planetary motion while trying to find the music of the spheres to Darwin developing evolution while studying to be a clergyman. But what Writer has figured out is how to systematically create the conditions for this kind of breakthrough thinking. The thing about quantization and distillation,
Speaker 17:34 every model is different, right? That's why we do this extensive testing because we want to know the characteristic of the model, how to design a system on top of the model. What she's describing is deep fundamental research into how AI systems actually behave. Not just how they perform on tests, but how they think, how they fail, how they can be improved. This is the kind of work that pushes the entire field forward
Speaker 18:01 and is paying off. While other companies are racing to implement the latest trending model, Writer is building AI systems that work reliably in the messy reality of enterprise environments. So if you have a startup, right, and you rely on API, and then suddenly the API provider changes the model, your entire business is in pieces, right? Because everything that you created so far on top of it, it stops existing. Writer's approach: understanding their models deeply, controlling their own infrastructure,
Speaker 18:33 optimizing for real-world performance rather than benchmark scores. This isn't just good engineering. It's the foundation for the next generation of enterprise AI. But here's the twist. Now that Melisa knows about machine learning, now that she's working with a team of world-class researchers, can she still have those accidental discoveries or has knowledge become a constraint?
Speaker 19:00 Those tokens that happened before are very important for the current token. If you change anything in past tokens, you would diverge from the trajectory that you're in right now. So there is a possibility that you would not end up getting the token that you have right now. So I've been thinking about my token right now and I'm really happy where I am. She's using the language of AI, tokens and trajectories to describe her own life.
Speaker 19:22 And maybe that's the point. At Writer, she's not just applying her curiosity to solve individual problems, she's part of a team that's rethinking what enterprise AI can be. The accidental algorithms of her past have become intentional innovations. And for other researchers and engineers who are tired of chasing the latest hype cycle, who want to work on problems that actually matter,
Speaker 19:48 who believe that the most interesting breakthroughs happen at the intersection of constraints and creativity. Well, Writer has proven that there's another way to build the future. Melisa Russak is a research scientist at Writer where her team has grown from two people to multiple specialized groups, all working on the kinds of fundamental problems that push the entire field forward.
Speaker 20:19 The next time you find yourself approaching a problem without knowing the right way to solve it, remember Melisa's story. Sometimes not knowing is exactly what you need to see something new. And sometimes, if you're lucky, you might find yourself in a place that rewards that kind of thinking. Thanks so much to Melisa Russak for sharing her story. I'm Elora Weaver, and this has been Humans of AI.
Speaker 20:52 Thank you to everyone for tuning in this week, and a huge thank you to the Humans of AI team over at Writer. If you enjoyed this conversation with Melisa, make sure you go check out Humans of AI wherever you get your podcasts. And don't forget to check out our own back catalog for our interview with Writer CEO, May Habib. Thanks for listening, and we'll see you next week.