6) Textual Healing
I'm hot just like a corpus, I need some purpose.
Behind-the-scenes building Vambrace AI, a company on a mission to forge stronger relationships with users. Subscribe to follow along or visit the site.
(typos are to make sure you’re paying attention)
Introductory Remarks
Dear Vambracers —
In last week’s post, User Acquisition, I (somewhat pedantically) explored the importance of users in building a business of any value and put forth a high-level blueprint for acquisition channels in the early “public beta” phase of my launch. Of course, user acquisition requires a product for users to use, and I’m pleased to report that we’re roughly on track for an end-of-June launch—and I’m dead set on shipping the product by then. Hopefully next week’s post will go deeper into launch prep and the final release. Okay, now for the topic at hand: text analysis.
Text Analysis
In today’s post, I want to discuss some general philosophies, tools, and approaches for text analysis that are on my mind as I: (a) embed myself within the broader commercial text analysis discourse, and (b) educate myself on leading-edge text analysis methodologies that could eventually make their way into our product. As high-level background, text analysis was my favorite course in college, and I love the general concept of applying analytical rigor to decidedly unstructured and non-analytical data (i.e., text).
Crystallized vision
Within the context of Vambrace, our initial intended core use case is user interview analysis (still unclear if this will mostly be discovery, feedback, and/or customer success), which (we think) will mean analyzing text-based transcript data (at least for the immediate future). The goal is to go “deeper for cheaper”: offer the most possible depth in the most efficient manner. I intend to leverage any and all tools available to deliver on this goal so long as they improve the results we deliver for our customers. And, to get even more specific, the “results we deliver for our customers” will be: reduced churn, compressed development cycles, and customer composition optimization (idk what that really means but somehow it has to do with the 80/20 rule). I admittedly need to get a bit more analytical here post-launch in quantifying the value we aim to deliver and the business vectors upon which that value accrues.
But, as I reflected on this goal and my general interest in text-based insights and how that informs my pursuit of Vambrace, I realized that the real vision for the company is to become the leading provider of commercial text analytics in the world. I want to position the company at the forefront of textual insights extraction. I don’t really know what that literally means yet, but I know that that’s our north star. And I envision a future in which we can basically extract leading indicators of user behavior from interview and other text-based data. So we pretty much want to bridge the gap between what humans mean and what humans say—because I’m sure we’d all agree that we rarely say what we mean.
Again, I’m not really sure what this will mean from a literal tool implementation and technique-set perspective, but I know that I’m committed to figuring out what it means and relentlessly experimenting until we’re at the leading edge of text analytics, deriving leading indicators of behavior before users even know what they’ll really think (or something like that).
Current capabilities
To that end, we must start somewhere! We’re currently using the following tools and techniques in our initial MVP (launching soon…) to provide a suite of business-based analytics, across individual users and in the aggregate, based entirely on uploaded transcript data.
General-Purpose LLMs
Perhaps unsurprisingly, general-purpose LLMs play a big role in our current approach. If you believe that LLMs at least somewhat accurately model the average behavior and intellect of a human, then what we’re doing makes sense: using LLMs to comb through every interview, extract the important information, and provide quote-based support for certain conclusions. From there, the LLM can discern patterns and themes across the entire corpus of interview data. I honestly don’t think there’s anything massively novel here, and I don’t think there necessarily has to be, other than a super comprehensive prompt based on like a 20-page Deep Research report on how to extract insights from user interviews.
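To make that concrete, here is a minimal sketch of what the per-interview extraction step could look like, assuming an OpenAI-style chat API. The prompt, function name, and model are my illustration, not our literal production code:

```python
# Minimal sketch of the per-interview extraction step, assuming an
# OpenAI-style chat API. The prompt and function name are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXTRACTION_PROMPT = """You are a user-research analyst. From the interview
transcript below, extract (1) key pain points, (2) feature requests, and
(3) churn-risk signals. Support every conclusion with a direct quote from
the transcript. Respond in JSON: each item has "finding" and
"supporting_quote" fields."""

def extract_insights(transcript: str) -> str:
    # One LLM call per transcript; cross-interview themes come from a
    # second pass over the collected findings.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": EXTRACTION_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```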
RAG (I think)
This might betray my relative lack of experience here, but I’m pretty sure we’re using RAG (retrieval-augmented generation) for our chat experience. All interviews are chunked, embedded, and stored in a vector database; we then find the most similar [x] chunks based on the embedding of the user’s query—and then feed the relevant transcript pieces (as well as some other more general knowledge-base stuff around business best practices) into our LLM to provide a comprehensive, actionable, and empirically supported answer (e.g., “What are the top two features that we should build next and why?” // “What are the top 3 reasons that users would churn and why?”). I’m pretty sure this qualifies as RAG, and it helps us ensure that we’re providing highly topical, specific, and empirically supported answers to our customers when they ask us questions.
A couple other things here:
(1) This is an area where there are interesting and impactful decisions to be made around chunk length, number of chunks retrieved per query, and then broader decisions around the knowledge-base content that we want to enrich our responses with. These decisions impact response depth, relevance, and speed. Just a fun and highly demonstrative example of the impact associated with technical architecture decisions. (A toy sketch of the retrieval loop, with those knobs exposed, follows this list.)
(2) This is speculative, but I have high hopes for the chat feature—which is why, over time, I want it to be as compelling as possible. Even in my own testing of the platform on my own interview data, the chat is where I can really start to see behavioral patterns take hold, because you can ask questions and get responses that are augmented by actual user quotes. And if I can set up the architecture to handle like hundreds of user interviews, then I think that’s how we can outcompete general-purpose LLM projects and artifacts. Easier said than done, but I have felt the beginnings of that pull into repeat use and habit formation—which is great!
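For the curious, here is a toy version of that retrieval loop, with chunk size and the number of retrieved chunks exposed as the knobs from point (1) above. The plain Python list stands in for a real vector database, and all names and parameters are illustrative:

```python
# Toy RAG loop: embed the query, rank stored transcript chunks by cosine
# similarity, and feed the top-k chunks to the LLM as grounding context.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def chunk(transcript: str, size: int = 800) -> list[str]:
    # Naive fixed-size chunking; real systems split on speaker turns or sentences.
    return [transcript[i:i + size] for i in range(0, len(transcript), size)]

def top_k_chunks(query: str, chunks: list[str], k: int = 5) -> list[str]:
    # Re-embedding every chunk per query is wasteful; a real system would
    # precompute and store chunk embeddings. Kept inline here for brevity.
    q = embed(query)
    scored = []
    for c in chunks:
        v = embed(c)
        sim = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        scored.append((sim, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]

def answer(query: str, chunks: list[str]) -> str:
    context = "\n---\n".join(top_k_chunks(query, chunks))
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer using only the provided excerpts; cite quotes."},
            {"role": "user",
             "content": f"Excerpts:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return resp.choices[0].message.content
```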
Highly Analytical Prompting
Finally, this probably goes without saying, but I have highly complex prompts that I created based on a lot of Deep Research work that I “commissioned” around how to extract insights from user interviews. It’s a fun example of leveraging different AI capabilities at once: I was able to get really, really smart on user interviews in a matter of minutes, and then could go from research to implementation within hours—in a way that felt pretty robust and actually pretty impressive. We’ll ultimately see if our users agree, but for now, I’m pleased with the depth of the analysis that we offer, which was made possible by our work in researching how to make the most of user interview data.
Potential future areas of exploration
As stated above, the goal of the company is to become the leading commercial text-based analytics provider. I don’t know what that will ultimately look like, but that’s the whale I’m chasing. This may be obvious, but I know that in order to get there I have to start somewhere. And so that’s how I arrived at all of the above, which delivers the capabilities of the MVP; from there, I’ll level up myself and the platform based on our own user feedback and the shifting technology landscape and see where things go.
It is funny, psychologically, because I really do view the MVP as introducing myself to the market, and then using it as a jumping-off point for much deeper analysis and more relevant work. I just think I need to make a good-faith effort to “come correct” and show possible stakeholders that I mean business. And I guess that’s how you build “brand,” which I think is really just logo-based trust earned in an authentic manner—which allows you to accumulate the goodwill required to take technical risks and bring users with you on that journey. Ya know? Regardless, here are some technical areas where I think this could go.
Neurosymbolic AI
In the past week, I’ve become privy to a decades-long rift within the world of artificial intelligence between symbolists and connectionists (eerily reminiscent of The Great Divide from Avatar: The Last Airbender). As I understand it, symbolists basically believe(d) that you can store massive amounts of symbols and logic in computers to effectively mirror human decision-making and/or extract relevant information, across literally millions if not billions of symbols. The pros are traceability (on some general level) and no real mysticism around what the computer is actually doing. On the other hand, connectionists—who pretty much seem to have won / be winning the technical battle—believe that neural networks with little intrinsic or embedded knowledge of logic or human input can pretty much generate human-like reasoning and responses when shown sufficiently large amounts of human-generated data.
The connectionist tradition informs the LLMs of today, and I’m pretty sure Ilya Sutskever is a hardcore connectionist who was trained by Geoffrey Hinton, who I think is like the OG connectionist (also s/o CMU). And so really OpenAI’s main breakthrough was around the amount of compute they threw at neural networks with many layers (i.e., deep learning), which got surprisingly good given a sufficiently massive amount of data to train on.
This background is relevant to the current section because a related area of study is neurosymbolic AI, which I think is pretty much a hybrid of symbolism and connectionism, whereby symbolic logic and information storage and retrieval are augmented by connectionist training and models. I’m not positive how this actually works technically, but I think there’s something that could be interesting there within the context of language.
My understanding is that it might be something more relevant to like word frequency, or where I could build “hardcoded” heuristics along the lines of “if [x]% of users express negative sentiment around [y] feature, then said feature is elevated to ‘concerning feature’” or something like that. The point, I think, is that we could theoretically provide more transparency and traceability to users, or maybe also use this to implement more temporal pattern tracking for data visualization. Think like a super-powered Google Trends, but for interview data. That could maybe be interesting.
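To illustrate, here is a toy version of that kind of rule layer. The threshold, labels, and input shape are invented for the sketch; the point is just that the escalation logic is plain, inspectable code rather than a model’s opaque judgment:

```python
# Toy symbolic layer on top of LLM output: the model tags each interview with
# per-feature sentiment, and a fully inspectable rule decides escalation.
from collections import defaultdict

CONCERN_THRESHOLD = 0.30  # escalate if >=30% of users are negative on a feature

def flag_concerning_features(tags: list[dict]) -> dict[str, str]:
    # tags: one record per (user, feature) mention, e.g.
    # {"user": "u1", "feature": "onboarding", "sentiment": "negative"}
    users_per_feature = defaultdict(set)
    negative_per_feature = defaultdict(set)
    for t in tags:
        users_per_feature[t["feature"]].add(t["user"])
        if t["sentiment"] == "negative":
            negative_per_feature[t["feature"]].add(t["user"])
    status = {}
    for feature, users in users_per_feature.items():
        share = len(negative_per_feature[feature]) / len(users)
        # Unlike a raw LLM verdict, this rule is traceable: you can point to
        # exactly which users tripped the threshold.
        status[feature] = "concerning" if share >= CONCERN_THRESHOLD else "healthy"
    return status
```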
Synthetic populations
Probably the most exciting area to me right now is the concept of synthetic populations. I know there are already some companies popping up to deliver synthetic business solutions, and for good reason. The basic idea is that, at some point, it makes sense to conduct entirely synthetic interviews to create, like, massive amounts of data that might be 85-90% real-world accurate at 5-10% of the cost. Specifically, if you have 10 interviews, you could reasonably generate an additional 100 interviews pursuant to some guidance around user personas and edge cases, giving you deeper insights from a broader population without actually having to conduct those interviews. There’s another company out there that automates the entire user interview process, including actually having an AI conduct the user interview call (creepy, imo), and so synthetic populations kind of cut out the middleman.
I also think it’s pretty interesting because I’ve kind of done it already to test my product. I created a synthetic universe of users based on Robertson Davies’ Salterton Trilogy and had ChatGPT create personas for each of them, and then generated 2-3 interviews for each until I had about 25 total interviews. I then uploaded these interviews to my platform, and they were pretty good for stress testing. I think there were generative limitations with ChatGPT itself, because I kept trying to get the interviews to be at least 10 minutes long, and that was challenging to do, even with strict word minimums—and I also had to remove silly stuff like behavioral cues (e.g., “sighs”). And I do think there are kind of standard behavioral patterns that obtain in AI-generated mock scenarios, given substantial overlap in prompts. So if there were some way to take some number of real user interviews and then accurately dictate realistic human variation in synthetic interviews, that could become very, very interesting.
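To sketch what that generation loop could look like (the persona format, prompt wording, and variation trick are hypothetical, not a tested recipe):

```python
# Sketch of persona-driven synthetic interview generation. The persona dict,
# prompt wording, and "contrarian opinion" instruction (a crude variation
# injector) are hypothetical.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_interview(persona: dict, product_summary: str,
                       seed_quotes: list[str]) -> str:
    prompt = f"""Simulate a realistic user interview transcript.
Interviewee persona: {persona}
Product under discussion: {product_summary}
Ground the interviewee's concerns in patterns from these real quotes: {seed_quotes}

Rules:
- Plain dialogue only; no stage directions like *sighs*.
- At least 2,000 words.
- Give this persona one idiosyncratic opinion that cuts against the grain,
  so the synthetic population doesn't collapse into one voice."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# e.g., loop over a grid of personas to grow 10 real interviews into 100
# synthetic ones, seeding each call with quotes from the real transcripts.
```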
Or maybe there’s eventually a world where you can run specific synthetic simulations of user sentiment to help you game-plan or prepare for those situations. I’m thinking: “OpenAI just released something pretty much identical to my solution.” Then generate like 50 interviews from fake and existing user personas about the blowback there, and how we’d respond (as a company). Or maybe you could do “[x] just happened in the world and now my users are generally more price sensitive” and see what happens. I’d have to really iron out the business utility and value proposition, but in an increasingly uncertain world it might be cool if you could actually start to prepare for different outcomes in terms of how your users receive and/or react to certain things. Even before the release of some big new product, what if you could run like 4 different synthetic interview simulations across (1) massive success, (2) modest success, (3) not-your-fault failure, and (4) massive failure? Maybe that could be interesting?
I think the point I’m trying to make here is that, as generative capabilities advance, it might be interesting to really intensely explore use cases where we move from solely human-based data generation to something like a 10:90 human:AI split while keeping pretty much 90% of the utility or whatever. Just something to think about!
NeuroNLP
The last area I’ll talk about here, and the technique that I’m least familiar with, is NeuroNLP. This is a super high-level SparkNotes version, but I think this general emerging discipline focuses on mapping how LLMs reason through text generation and interpretation onto how human brains do the same. If you can employ techniques to better understand how different areas of the brain light up for certain tasks, then maybe you could do that same exercise for AI neural networks, and then arrive at deeper truths around how LLMs really “think,” in a manner consistent with how humans think, and so then maybe you could go one level up on the abstraction spectrum and get into like neural electricity to discern intent.
One thing I’ve generally been thinking about a lot is how imperfect language is as a representation of feeling and intent, to ourselves and to others, and how it would be great to somehow use language to discern real feeling and intent. But obviously, to do something like that, you’d need “feeling and intent data” and map it to language—and then probably cluster based on psychological profiles and the like—to actually get to using language to discern feeling and intent. I don’t really know if that makes much sense, but I do know that language is an imperfect prism through which we filter and express our thoughts and feelings—and actually getting to the thoughts and feelings would be fascinating. And, I guess, it would do a really good job of preventing churn and delighting customers—which would be the commercial relevance of NeuroNLP.
This is super silly, but there was a random show called Zoey’s Extraordinary Playlist where a software engineer could hear people sing songs about their deepest, darkest thoughts and struggles, and then she would intervene in their lives to help remedy the situation or whatever—and form deeper connections in the meantime. It’s silly, but I think it does get at this general idea: everything we’ve got going on inside us is so much deeper and more complex and more difficult to say than the actual data we do generate (i.e., our words—and, I guess, actions), and how we close that gap is all I’m trying to figure out. Is that too much to ask??
Looking Forward
In the past week, I really felt the company’s vision, purpose, and mission start to crystallize, and it’s been energizing and emboldening—and also urgency-intensifying, because I know all that’s out there for us to explore and learn and build, and I do think it’s in line with general tailwinds in how our species develops and uses technology. In short, I’m really excited by the space we’re in and committed to learning as much as I can, to get us as advanced as possible as quickly as possible, to deliver the best results possible for as many people as possible, and to somehow uplift humanity in the process. Okay, so, I said “in short,” but then it became long. So, in actual-short: I just want to understand words.
Hopefully I’ll have more concrete information for you next week w/r/t an actual “public beta, no promotion” launch, and we can start to see how little the world cares about what I’ve built! Which will compel me to become as annoying as possible to as many people as possible until they just give my platform a try. And then we’ll see if we can’t iterate and test our way to something that resembles a legitimately value-delivering business solution. Much, much work to do between now and then. But I’ve never felt more confident in the road I’m on and the shoes I’m wearing.
Have a wonderful week!
Sincerely,
Luke