For junior Karina Halevy, language is quantifiable. As the founder of LingHacks, Karina recently hosted the Bay’s first-ever computational linguistics conference for high schoolers, where she taught other students to combine language, computer science and math to make sense of the messy and dynamic patterns that make up human speech.
But Karina’s journey began with her love for language itself, before she ever learned about what she could do with it.
Having grown up in a trilingual household, Karina is fluent in English, Chinese and Hebrew. Through classes at school, she has become fluent in Spanish as well. Karina even learned languages at her ballet studio. Listening to her instructor speak Russian, Karina noticed familiar patterns in syntactic structure and pieced together the meaning of phrases solely through context.
“The strategy I use the most is just contextual learning, which is the way I learned Russian,” Karina said. “My teacher was speaking Russian to others and I just put two and two together. Hebrew and Russian share a bunch of cognates that I discovered at ballet. Our gym coach was telling us to do something and I was like, ‘Hey, that was the same word in Hebrew.’”
Karina is now teaching herself to speak Hindi, Urdu, Italian and Indonesian. Although Hindi and Urdu have been more difficult to learn because she has no experience with similar languages, Karina embraces the challenge, intrigued that language is, at its core, algorithmic: filled with patterns and grammatical rules.
“I just enjoy taking apart syntactical structures and parsing different languages, and it also has kind of a mathematical field to it,” Karina said. “I love all things algorithmic, so it’s kind of artful and elegant.”
This analytical view of language meshed with Karina’s affinity for math, and she began to apply math to language through artificial intelligence. During the summer after ninth grade, Karina attended SAILORS, a Stanford artificial intelligence summer camp, where she learned to use mathematics to parse language through computational linguistics.
“My two worlds [of math and language] came together when I attended [SAILORS] at Stanford, where I did a project on natural language processing,” Karina said. “I saw the computational linguistics aspects integrated and I found that super fascinating, so that was kind of my starting point in computational linguistics.”
At SAILORS, Karina developed an algorithm using natural language processing, a method to get a computer to understand the meaning of human words. Karina’s algorithm takes in tweets during natural disasters and labels them as either “food,” “water” or “medical” needs; this information is then used to supply resources to the people who need them.
“You take in a tweet and parse it into features, which is breaking it down word by word,” Karina said. “For each word, the algorithm says, ‘if this word exists in the tweet, then what’s the probability that the tweet will be about this category?’ You then [give] each probability [a weight] and spit out the most likely [category]. Within each of those [assigned] categories, [the algorithm] matches people who needed resources to people who could give them.”
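The steps Karina describes (split a tweet into words, ask how strongly each word points to each category, combine the evidence and pick the most likely label) resemble a naive Bayes text classifier. The article does not name the technique or show her code, so the Python sketch below is only an illustration under that assumption. The sample tweets, the function names and the add-one smoothing are all hypothetical, and the step that matches people in need with donors is not covered.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training tweets, invented for illustration only.
SAMPLE_TWEETS = [
    ("We need rice and bread for the shelter", "food"),
    ("No food left, children are hungry", "food"),
    ("Clean drinking water needed urgently", "water"),
    ("Our well is flooded, no water to drink", "water"),
    ("Need insulin and bandages for injured", "medical"),
    ("Doctor needed, people injured here", "medical"),
]


def tokenize(tweet):
    """Break a tweet into lowercase word features, dropping edge punctuation."""
    words = (w.strip(".,!?#@").lower() for w in tweet.split())
    return [w for w in words if w]


def train(labeled_tweets):
    """Count how often each word appears in each category."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    category_counts = Counter()         # category -> number of tweets
    vocab = set()
    for text, category in labeled_tweets:
        category_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return word_counts, category_counts, vocab


def classify(tweet, model):
    """Return the category with the highest naive Bayes score."""
    word_counts, category_counts, vocab = model
    total_tweets = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, n_tweets in category_counts.items():
        # Start from how common the category is overall (the prior).
        score = math.log(n_tweets / total_tweets)
        n_words = sum(word_counts[category].values())
        for word in tokenize(tweet):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen word/category pairs above zero.
            count = word_counts[category][word]
            score += math.log((count + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


model = train(SAMPLE_TWEETS)
print(classify("Families here are hungry and need bread", model))
```

Summing log-probabilities rather than multiplying raw probabilities keeps the score numerically stable on longer tweets. A real system would need far more labeled data than these six examples.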
Karina learned from this project that the implications of computational linguistics are far-reaching, so she began to share her passion with others, as both a polyglot and a student of mathematics. On November 18, Karina hosted LingCon, a computational linguistics conference that included interactive coding workshops, a lecture on conducting research similar to Karina’s SAILORS project and a panel session of industry experts from Google and Stanford.
“I thought I could put [computational linguistics and hackathons] together and engage an even broader audience through this intersection of the humanities and the sciences,” Karina said. “I’d say LingCon was a success. [Many attendees said that] the conference definitely gave more insight on what life could be like if [they] pursued computer science. Our feedback showed that students came out of LingCon 16 percent more likely to pursue a career in AI and 40 percent more likely to pursue a career in computational linguistics.”
Many attendees, including junior Anya Sharma, noted that LingCon introduced them to a specific area of computer science that they want to pursue in the future.
“Some of the lectures and workshops they talked a lot about machine learning, not just computational linguistics,” Anya said. “[The conference] taught me that machine learning is applicable to a ton of different fields. Before coming to LingCon I already wanted to pursue computer science, but afterward I realize that machine learning was a more specific interest that I want to explore.”
LingCon served as a “training day” for high school students to learn the skills needed to create a product at LingHacks, a 24-hour computational linguistics hackathon that Karina is now preparing to host in the spring.
“[At LingHacks], there’s going to be an extra focus on long-term potential rather than just having something ready to pitch,” Karina said. “We really want to have people think about how they can do research and use math and really dig into how their things work in the long term. My biggest project [currently] would be LingHacks, and I’m starting a research project next semester with an alumni program from the Stanford summer program.”