How Glean CEO Arvind Jain Solved the Enterprise Search Problem – and What It Means for AI at Work
Years before co-founding Glean, Arvind was an early Google employee who helped design the search algorithm. Today, Glean is building search and work assistants inside the enterprise, which is arguably an even harder problem. One of the reasons enterprise search is so difficult is that each individual at the company has different permissions and access to different documents and information, meaning that every search needs to be fully personalized. Solving this difficult ingestion and ranking problem also solves a key problem for AI: feeding the right context into LLMs to make them useful for your enterprise. Arvind and his team are harnessing generative AI to synthesize, make connections, and turbo-charge knowledge work. Hear Arvind’s vision for what kind of work we’ll do when work AI assistants reach their potential. Hosted by: Sonya Huang and Pat Grady, Sequoia Capital 00:00 - Introduction 08:35 - Search rankings 11:30 - Retrieval-Augmented Generation 15:52 - Where enterprise search meets RAG 19:13 - How is Glean changing work? 26:08 - Agentic reasoning 31:18 - Act 2: application platform 33:36 - Developers building on Glean 35:54 - 5 years into the future 38:48 - Advice for founders
- Published Oct 29, 2024
- Uploaded Jun 11, 2026
- File type: Podcast
Full transcript
AI-generated transcript with timestamped sections.
[00:00] majority of the work that we do today [00:02] is not going to be done by us anymore in five years from now. [00:06] And that applies to me, that applies to you. Like, you know, we both do very different things, but still like, I think, you know, we are knowledge workers. [00:12] and [00:13] I think a lot of our work is actually going to be done by these amazing AI assistants that are actually, in many ways, you know, more powerful than us. Like, you know, they have access to [00:24] all of our company's data, our knowledge, they have all the context from all the past conversations and meetings. [00:29] They don't forget anything. And, you know, on top of that, they have the reasoning capabilities, you know, that allow them to be super helpful to you in any tasks that you do. So that's sort of, you know, our core belief, like, you know, that majority of our work is actually going to be done by these AI companions or assistants. And we want Glean to be, you know, that assistant, you know, in the workplace. [00:51] *music* [01:08] Joining us today is Arvind Jain, co-founder and CEO of Glean. [01:12] Earlier in his career, Arvind was instrumental in building Google search and was co-founder and CTO of Rubrik. [01:18] Glean began life as an enterprise search company and today has evolved into a general purpose work assistant. [01:25] Bringing AI into an enterprise context is notoriously difficult because of the integrations, the permissions, the ranking, the parsing, all the other magic that needs to happen to make AI work on your company data.
[01:37] Arvind joins us today to share how Glean is solving this problem where other companies have failed, [01:41] and what he's learned as one of the first successful AI native application companies. [01:46] Arvind, thank you so much for joining us. [01:50] We have a lot of questions about RAG and agents and knowledge graphs and all of that. But before we do that, can you give us one or two minutes on what is Glean and what are you building? Yeah, first of all, thank you for having me. Glean, think of it as the Google or ChatGPT, [02:06] but inside your enterprise. It's a place where your employees go and ask questions [02:11] and Glean answers all of those questions [02:14] using your company knowledge, [02:16] regardless of where that knowledge is, [02:18] and brings it all back to you. [02:19] Yeah. [02:20] So that's what Glean does. Glean is also an AI platform. So if you want to actually build [02:25] AI applications inside your company, [02:27] you can use the Glean RAG platform to build those applications quickly. [02:32] Wonderful. And since you make the analogy of a Google for work, [02:36] Google for Work, I think, is something that every CIO has described as their holy grail, and we have two decades of failed attempts at [02:43] building it. You were actually a star search engineer at Google before, and even Google never managed to crack [02:51] this category before. Maybe can you just say a word on why is this such a hard problem and how did you do it? [02:57] Well, I mean, search is hard because it's actually magic in some sense. Like you can come and ask any question that you have [03:05] and you expect the system to actually give you back the right answer.
[03:08] So expectations are always high. [03:10] And it's a difficult problem, especially in the enterprises, because there's so much information inside the enterprises spread across so many different systems. [03:20] It's both hard to actually even get hold of that information [03:24] but then even harder to actually make sense of what information is actually good, what has become out of date. So, you know, there are, like, you know, lots and lots of challenges around, like, you know, building that, you know, system. [03:37] And in the past, like, you know, I would say that, like, you know, there were no good attempts made. Because, you know, the problem was so hard. It requires so much R&D, so much investment. Like, you know, it was not really startup friendly in many ways. And you also, like, couldn't even, you know, in the pre-SaaS world, you couldn't even build a product. Because, you know, just connecting with all of your enterprise data meant that you had to spend like a year sitting with an enterprise, [04:07] and then actually solve the real problem, which is make that information searchable. [04:12] Arvind, one of the things that I think is so interesting about Glean is you are probably one of the first and best examples of what an enterprise AI application company [04:23] can or should look like. And we're going to focus most of this conversation on the AI aspects of Glean. However, I know there are a lot of layers to your stack. You've got the infrastructure, you've got the connectors, you've got the governance engine, you've got the knowledge graph. Can you say a couple words about all the stuff you had to build before you even got to the AI part to make the AI work?
[04:42] Absolutely. So as you said, like, you know, search starts first with the data and the knowledge that you need to actually make searchable. So the first part of the Glean tech stack is these deep integrations that we've built with most common enterprise [04:58] systems. So think of systems like Salesforce or Confluence, Jira, Google Drive, SharePoint, ServiceNow. [05:05] Like, your enterprise data [05:07] typically lives in all of these different systems. [05:09] You need to bring it all together in one place. [05:11] So that's the first part of our technology stack, these integrations. But then if you think about, like, enterprise data, and this is one of the most unique things about enterprise search versus, if you think about Google search on the web, is most of our enterprise information is actually private in nature. [05:30] When you author a document [05:31] in Google Drive, you know, like this document may actually be private to you, or you may share it with a few other people. [05:37] And you can't build a search engine [05:39] where you just dump all the company knowledge and make it accessible to everyone. You have to actually understand permissions [05:45] of each content. So when you go and search, [05:48] the system should understand who you are and only retrieve information that you actually have access to. So that's our governance layer, understanding governance across all of these hundreds of different systems, which is quite complicated. [06:00] And then the third part, and this is where really most products have failed in the past, is search is not about [06:07] just putting like a whole bunch of documents [06:11] in an index,
[06:12] and [06:13] then like, you know, when somebody comes and asks a question, take those words or take that question and just match it up semantically or with keywords, you know, with the right content. You've got to actually also understand who's the person who's asking a question. [06:25] I can come in and ask for [06:29] um, an onboarding guide because I'm a new employee, but then which onboarding guide should be actually, you know, given to me, like it depends on, like, you know, whether I'm in the marketing team or I'm in the engineering team. So understanding people, you know, [06:43] and understanding knowledge and relationships between them, you know, that's a big part of actually making, you know, a search or a question answering service work inside an enterprise. So we do that. So we actually build a deep knowledge graph, you know, where we look at all the employees, understand what roles do they play in the company, look at all the documents, and then we sort of try to understand what documents are meant for what departments, what documents actually are popular. You know, what are the relationships between, you know, a particular individual and a particular document? [07:13] And that is what we use as a core foundation that sort of then governs, like when somebody comes and asks a question, what are the most relevant pieces of knowledge for them? So we had to do all of that work and actually, interestingly, you mentioned what came before AI even became relevant. For us, [07:31] AI was actually part of the core search technology from day one. We were actually working with LLMs [07:37] in 2019, [07:39] or I mean, at least language models, like, you know, these were, you know, BERT-based language models that you could use. And so when Glean... MLMs.
[07:47] - That's right, is that the name for them now? But yeah, in the search, like engineering, [07:54] sort of community, we just called them language models at the time. And so the language models were actually part of the core search experience from day one because it really allowed us to understand [08:07] content at a semantic level. So that we sort of already baked in, like in our core search experience on day one, when we're actually trying to look at a question from a user. We were never sort of limited by the actual exact keywords that users use. We were able to actually understand the meaning behind the question and actually match it up with the right documents. But still, that's sort of like all the work that you have to do before you can actually even [08:32] you know, do anything with LLMs. [08:34] Can you say a word about rankings? Like, I think part of what makes Google work so well is, like, you know, I always get the answer I want at the top of the page. Right. [08:42] You know, in the case of the public internet, you have so much [08:45] web data and links and all that to make the rankings really work. To what extent is that part of the magic for Glean and how do you guys do it? [08:55] Yeah, so that's of course the core of the product, all the effort that we put in [09:00] to actually build a really good ranking system for our search. [09:05] I'll give you some examples of the kind of things that go into determining what documents are the best ones to rank for a given question. So, of course, if you imagine that there's a document that [09:17] people in the company are constantly looking at.
[09:20] So that gives you obviously a signal that there must be something about it. Like, you know, it's actually important. People like to spend time on it. If there's a document that was actually written in the last [09:32] you know, one or two weeks, [09:34] and there's some engagement around it, you know that this again is information that people care about. It's not actually become obsolete yet. [09:43] Then if you think about [09:45] the [09:46] a particular [09:48] document that we see is not popular [09:52] when you look at a company level, but when you look at one individual team inside the company, we see there's heavy usage for that document [10:00] you know, inside that particular team, so that sort of tells us that, hey, this document may actually be relevant for this particular [10:06] you know, set of people. Or the last thing, the last example, imagine that, you know, somebody had a question and they're not, like, bothering to search. They go in Slack and ask a question, [10:19] and then somebody else posts a link to a document [10:23] as a response to that. [10:24] And the person who asked the question gave a thumbs up to it. [10:28] Just imagine this interaction, what it means. It actually means that that particular document was actually a really good answer for that question that the user had asked. [10:38] And so if you keep that association in mind, like it's going to help you later when somebody else comes and asks a similar question. So those are some of the signals. We have to sort of constantly look for all these signals. You know, you have to collect them differently
[10:52] in the enterprise setting than on the web. Google only has to look at all the activity that's happening [10:59] right on Google itself, because that's the gateway to any sort of knowledge quest. But if you look at the enterprise, not all the things are happening through the search paradigm. So you have to go and look at all the activity around all the knowledge [11:13] in different systems, like in your communication systems, your document systems, and just try to learn from that human behavior. Because ultimately, that's how you learn. You learn from what people are doing inside the company. The more you can actually collect that information, the better your ranking systems are going to be. [11:30] Can we spend a minute on RAG? As Pat mentioned, [11:34] you were kind of in the right place at the right time. You had put together all the hard stuff so that when, you know, when the LLMs got really good, you kind of had all the infrastructure in place. And I think you've been one of the experts in [11:48] you know, using RAG to make these LLMs actually [11:51] useful on your corporate content. Can you explain RAG to me like I'm five years old, and like what are the secrets to making it work? What are the things that people don't talk about? [12:01] What are examples of things that you can do thanks to RAG that you can't in a generic chat interface? [12:07] Yeah. Well, so first of all, I think since you were talking about a five-year-old, let's first talk about what RAG is. Okay, 10-year-old. 10-year-old. So, yeah. No, let's keep it five. I need five. Let's start with five. Let's start with five, yeah. So, I mean, like, if you think about all these amazing models, you know, GPT and Gemini and Claude, these models are all trained on the world's public knowledge
[12:30] and data. [12:32] And so if you were to actually go to ChatGPT and ask a question like, hey, how many days do I get off with my PTO policy? [12:39] It has no idea. It can't answer that question because that's my company's private knowledge. [12:45] You know, the answer is somewhere there, [12:47] and the model is not trained on it. [12:49] So how do you bring [12:50] your private enterprise data [12:53] Um... [12:54] to these models so that you can actually have AI create that magic for you? [13:02] That's what a RAG-based AI application architecture allows you to do. [13:07] So the way it works is that, you know, you come and ask a question and you have a search engine [13:12] or a retrieval engine, whatever you want to call it. And given the question, this retrieval engine [13:18] you know, finds, you know, potentially relevant, you know, documents, you know, that could actually answer your question. And then you're going to take those documents or those sort of content fragments [13:29] and make the model work on it. You'll tell the model like GPT that, hey, I have this question and I have this company knowledge that I think is relevant, you know, in terms of answering that question. [13:39] Now you answer that question using this knowledge. [13:41] So this is how most AI applications today are being built in the enterprise. [13:47] The only way to actually connect [13:49] Um, [13:51] you know, your private enterprise data to the power of these language models is basically a search engine that's sort of sitting, [14:00] you know, in the middle. 
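Editor's sketch: the retrieve-then-generate flow described above, combined with the permission check mentioned earlier in the conversation, can be illustrated roughly as follows. This is a minimal toy under stated assumptions, not Glean's implementation: the `Document` class, the `allowed_users` field, the word-overlap scoring, and the sample documents (including the PTO figure) are all invented for illustration, and the final call to a language model is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Hypothetical access-control model: the set of users allowed to read the doc.
    allowed_users: set = field(default_factory=set)


def tokenize(text: str) -> set:
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return {w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")}


def retrieve(question: str, user: str, corpus: list, k: int = 2) -> list:
    """Return up to k documents the user may read, ranked by word overlap."""
    q = tokenize(question)
    # Permission filter first: never score documents the user cannot see.
    visible = [d for d in corpus if user in d.allowed_users]
    scored = [(len(q & tokenize(d.text)), d) for d in visible]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(question: str, docs: list) -> str:
    """Assemble the text that would be handed to a language model."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (
        "Answer the question using only the company knowledge below.\n"
        f"Company knowledge:\n{context}\n"
        f"Question: {question}"
    )


# Made-up sample data for illustration only.
corpus = [
    Document("pto-policy", "Employees get 20 days of PTO per year.", {"alice", "bob"}),
    Document("exec-comp", "Executive PTO and bonus details.", {"bob"}),
    Document("lunch-menu", "Friday lunch is tacos.", {"alice", "bob"}),
]

question = "How many days of PTO do I get?"
docs = retrieve(question, "alice", corpus)
prompt = build_prompt(question, docs)
print(prompt)
```

In this toy, the same question asked by a different user can produce different context, because filtering by permissions happens before ranking. A real retrieval layer would replace the word-overlap score with the semantic matching and behavioral ranking signals discussed in the episode.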
So, given that, you know, at Glean also, like, you know, we of course built a search engine over all of your enterprise content, you know, in the last, you know, five years, it actually allowed us to become, like, you know, one of the best, you know, RAG systems, you know, that allows you to now not only like, of course, you know, we deliver our own end user application, which is a Glean
[14:30] you know, companies to actually build, you know, more and more applications using RAG. Now, I think like, you know, while this is the architecture that is sort of, you know, emerging as a canonical architecture for building AI applications, I think it is still, like, you know, full of challenges. It's actually really hard [14:51] to build [14:52] great AI applications using RAG. Because one of the things, you know, like, you know how models themselves are sort of, while very powerful, they're also still an emerging technology, right? Models hallucinate, you know, they make things up. And what you're doing now is you're actually, you know, adding one more complex layer of technology [15:11] you know, in this application architecture. So think of it like you're chaining two things, you know, which are both not perfect. So oftentimes, you know, you will see a RAG-based AI application not perform well, because you asked a question, [15:26] and the failure actually happens at the RAG, you know, at the retrieval stage, [15:29] where you weren't even actually [15:31] you know, able to find the right pieces of knowledge, [15:35] or maybe you found, like, you know, stale information [15:37] that then you're actually giving to the LLM to work on. And then, of course, it's going to give you bad results. So it actually, you know, while it's the only way to sort of bring knowledge together, it creates these interesting challenges for you as well. [15:52] Let me ask you a question. And just to paraphrase a bit what you were saying at the start of this conversation, Act 1, Enterprise Search, Act 2, Application Platform. For that Act 1, which is Enterprise Search,
[16:05] how do the concepts of enterprise search [16:08] and RAG relate? Is one a superset or a subset of the other? Are they similar but distinct? Are they the same thing? How does enterprise search and RAG, how do those concepts relate? [16:20] So I think like, you know, I think of search and RAG as [16:24] being in some sense, you know, they're one and the same thing. The real core technology is taking all of your [16:31] all of your knowledge, enterprise knowledge, and [16:35] putting them into this search system [16:39] where now you can actually ask questions [16:42] and the system is able to actually give you relevant pieces of information back. So that's sort of the core technology. Now you can actually use this technology either as a standalone product. [16:55] So that's what the Glean search product is, for example, where, you know, people come in, they ask questions and we can give them the relevant documents, you know, that potentially are useful to, you know, to them based on that question. [17:06] Or you could actually use this as an API layer in your overall AI application. So where the search system, the search module now is only one component of your overall AI application architecture. [17:24] Um, and, um, [17:26] So I think in that sense, it's sort of similar. But the industry, on the other hand, I think what we've seen, most of these RAG-based applications
[17:36] in the enterprise today, [17:38] they actually use a much simpler version of retrieval system [17:42] in their RAG application, [17:44] typically like a vector search based system, [17:47] which doesn't really have full enterprise context. [17:50] And so that's, I would say, that's the key difference. So for us, our approach always has been [17:57] to really think about how to build a standalone search system, something that is good enough that you can actually put it in front of the users [18:08] as a standalone product. That's really the real test for [18:11] like how good the search is. And then when you've put it behind the scenes in a RAG-based application, it's actually going to create better AI experiences. [18:23] So is it fair to say that kind of the magic that you've done in terms of getting a good ranking of search results, [18:28] that is exactly like you've made that ranking good for people. It turns out making that ranking good for people is also what you need to make the ranking good for machines in order to get [18:37] in order to get the best possible results. And that's why what you've built is very different from somebody that's just DIYing a data pipeline in their own little retrieval system. [18:46] Yeah, that's correct. I mean, I think it's really hard to build these systems, you know, like yourself and build them in a matter of weeks, I think. [18:56] I think you can build a great AI demo in one day, maybe in two hours now. But I think to actually build where it's robust, it's stable, it actually adds value to your...
[19:10] you know, within your enterprise, you know, like, you know, it's a hard problem. [19:13] So we've talked a little bit about how you've built what you've built, and we know that it's working. We know that the company is quadrupling year over year, and we use it here internally. And there are a lot of happy people out there who are customers of yours. Um... [19:26] The real measure of success in some ways is how your product is changing the lives of its customers. And so I'm curious to hear from you when you look at your customers and sort of how they operate day to day, pre-Glean versus post-Glean. What are some of the changes you notice? How does this help people do their jobs? So Glean is actually a product that is used quite heavily by people. [19:53] The... [19:55] There's many, many different types of things. We're often surprised by what people are using Glean for. But I would give you a few examples. So for engineering teams, [20:08] you know, they find Glean super useful, like in terms of, like, troubleshooting, whenever you run into any kind of a roadblock [20:14] or an issue, like sometimes an error, like, you know, [20:18] you know, programs are not working properly. And so Glean serves as a really good troubleshooting tool for them. Like, you know, it's a place where you go and debug because, you know, you post the issue, and more often than not, you're not the first one, like, who's going to experience an issue. Like, somebody else has experienced those issues before. So just getting the context from all of those, [20:36] you know, uh, like all the other people and like how they solved that problem before, like, you know, sort of helps you, you know, solve that issue for yourself. So that's, you know, that's a big, um, like, you know, big, um,
[20:49] use case for engineering. [20:51] For some roles, like for support, their life day in, day out is about resolving, answering people's questions. And I think a tool like Glean actually fundamentally just changed how they work now because by default now they don't think about, given a question, [21:15] trying to go and look for, like, answers in different knowledge bases and whatnot. Like, you know, instead, like the first, you know, reaction, you know, that they have now is that, you know, there's a question that's coming in from a customer and then Glean on the side is actually already answering those questions for them. So their sort of, like, model of working, you know, changes from [21:35] trying to find things to actually trying to validate, like, you know, what AI is telling them is the right answer and then just share that back with users. Some teams have actually, like, you know, really changed their behavior. Like salespeople, for example, they use Glean as a way to prep for meetings. So before, like, a customer call is coming up, they will just ask Glean, like, you know, they can be lazy, they can ask Glean, like, you know, help me prep for this meeting, [22:01] and Glean is actually going to bring that 360 view of all of the data from that customer, like what happened in the last meeting, [22:08] who's in, like what opportunities are open with them and things like that. So it sort of really helps them [22:15] you know, prep for a meeting, then actually run a meeting well, you know, because, you know, customers always have lots of questions. And so the salespeople feel more confident, you know, like running that meeting, because, like, you know, if somebody throws a curveball at them, you know, they can just, like, you know, just ask Glean, like right in the meeting, you know, get the answer and quickly sort of, you know, get the responses back.
[22:36] So in fact, like, you know, [22:37] in our company we don't allow [22:41] um, salespeople to actually bring in sales engineers in the call. Like, they have to answer those questions themselves. So that's, like, you know, one change in behavior that we drive [22:48] like in the first few calls. [22:52] Um, yeah, those are some of the things. But overall the use cases are unbounded. Like, the one that I think is, uh, the one that's universal across everybody, you know, inside the company is, you know, finding other people who can help you. That is one of the things, you know, that Glean makes really easy for people, right? You know, we help you connect, you know, uh, [23:10] with the right subject matter experts, based on what questions you have. So that's one thing that we see everybody in the company make use of a lot. [23:19] Is there a North Star metric you track? These are wonderful stories of customer impact. [23:24] I guess, how do you benchmark yourselves objectively? Yes, our key metric is how many questions people ask [23:32] on a daily basis [23:34] and actually get successful [23:36] Like, you know, we were successful in answering those questions correctly for them. [23:39] Hmm. [23:40] So similar to like Google's like search set metric then. Like, okay. [23:45] Yeah. [23:45] Can you share anything about those numbers, or do you prefer to keep it private? Well, we have this technical metric. I don't know how much sense it's going to make, but we tend to actually keep that number at 80%. So I think it's a proxy for that: [23:58] 80% of the sessions that users had with us, they were actually successful in getting what they needed. [24:04] And is that, do you measure that success? Is that,
[24:07] explicitly, they thumbs up, this was good, or is that implicitly, they take action on the basis of the results that you served and you can see that action taken? How do you actually measure the success? [24:19] It's actually implicit. So we will [24:23] track their actions. For example, in search, when you come and ask a question, and then you click on one of the top two or three results, [24:31] and go to the destination [24:34] and then stay there for a long time. [24:36] So that sort of gives us an indication that you are happy. [24:39] You didn't come back and ask another question quickly or refine your search. So that's sort of how we track whether somebody is successful or not. Got it. What are some of the top things that are not in the product yet that you think will make people more successful? [24:52] I think the, like I was, I referred to this, you know, when we started, that building a product like GPT or Glean, [25:00] it's sort of like magic. Like, you know, the expectations are infinite. [25:03] And because you're supposed to basically not just answer any questions that people have, [25:08] but also, like, you know, perform, like, any task that they actually ask you to [25:11] you know, to do. [25:12] And [25:13] So for us it's not so much about [25:16] what features are missing. Like, the big thing that we have to actually keep working on [25:21] is actually being successful at this core feature, which is, like, you know, answer people's questions correctly and answer [25:28] questions of [25:29] like, higher and higher complexity correctly for them over time. So we feel like us or anybody else out there today,
[25:39] we're all very, very far away from [25:41] that true vision for our product, which is that we want Glean to be that AI assistant [25:47] that can actually [25:48] answer any questions that you have using your company knowledge, [25:52] that can actually do half of your work for you [25:54] in the future. [25:55] I think I would say we're maybe 2% of the way there. With AI, all said and done, we're still in very early stages [26:04] of, like, making, you know, that impact. [26:08] So we're only 2% of the way there. I'd love to ask you actually about agentic reasoning. It's something that's been on our mind a lot as a partnership at Sequoia, and I know it's been on your mind as well as a founder. [26:18] And one of the results that I was really impressed by [26:22] in the coding space was that with RAG, I think these coding agents can get to 3%, 4% [26:29] completion rates, but if you give them more agentic reasoning capabilities they can get to [26:35] 14, 15%, um, so like a multi-fold improvement. And, you know, it's as simple as [26:41] you know, go reflect on what you just said, or best-of-N, or, you know, whatever the techniques are. I'd love to understand how you guys are thinking about incorporating more agentic reasoning [26:51] into your products. [26:54] And anything else to kind of get us from that 2% where you said we are today to what you hope to build one day? [27:01] Yeah. [27:02] And I want to clarify that 2% is something that I was making up. Like, as you know, it's not a measured number. Yeah, yeah. I just wanted to sort of, you know, express, like, you know, like how
[27:13] how early things are today and how much amazing things we're going to see in the future. Like I was just basically trying to talk more about that. But in terms of like, you know, [27:24] agentic [27:27] um, [27:28] sort of like a, you know, behavior. Um, [27:32] One of the things that we are [27:34] um, [27:35] doing on that front is [27:37] first [27:38] Try to actually get a lot of [27:42] input from our users. So we have we have a concept of building a workflow and [27:47] Um, [27:48] inside Glean to solve to actually answer a complicated question. [27:53] And [27:54] And today we actually seek a lot of help from our users. [27:58] in sort of completing that workflow. We'll actually, say for example, if you come and ask a question, [28:05] Um, [28:06] like help me write a [28:08] weekly status report of all the work that my team did. So this is your question. Now, if you think about this question, like, you know, it's complicated. Like, you know, there are a few things you need to do to actually go and really figure out the answer to it. [28:21] The first thing is you have to actually understand, you know, what do you mean by your team? Like who's your team? [28:26] You have to maybe go in your HR system, [28:29] Try to figure out who are the people who report to you. [28:32] Then we are talking about work, like, you know, like what, like, so where does work happen for each one of these team members? You have to sort of build an understanding of that. [28:39] and then go sort of pull a bunch of knowledge from all these different systems. So I think right now what we're doing is we're actually...
[28:48] trying to actually get help from our users. [28:51] And [28:52] And we will sort of create a plan [28:54] for a complicated question, try to actually get the user to actually input and tell us like, you know, are we getting it right? Sometimes, you know, [29:00] users can actually explicitly, like they can completely ignore what we do, and just build a workflow on their own. [29:05] And I think that's going to be essential [29:09] for us to [29:11] like build that, you know, fully agentic behavior for the future. I think some, you know, you can build agentic behavior for, [29:18] like specific narrow set of problems. [29:20] But in Glean, since our footprint is so wide, the range of questions that people can have, the range of tasks that they want to perform is so broad [29:31] that we feel like, you know, first we have to learn. We have to learn from, like, you know, workflows that people are going to actually create manually. Um... [29:38] and [29:39] and then build these models, which can then sort of take complicated questions in the future and automatically build those, convert them into these, [29:47] uh... sort of like agent loop or [29:49] or a complicated workflow. So that's the approach that we are taking on it. So you're saying since you have such a broad surface area, you can't build [29:58] agentic reasoning for every single possible task. And so instead you're exposing a workflow engine [30:03] for your users to individually be able to build [30:06] different automations and different agents. [30:09] - Yeah, and then you learn from it. And then you learn from it. So once you see people building these workflows, that sort of then feeds, goes into a training
[30:21] data set to sort of allow you to actually automatically build [30:25] um... [30:25] new workflows based on complicated questions that people have. Those agentic capabilities are coming, but I would say, [30:37] when it is hard for you to answer simple questions, then [30:42] if you want to do complicated tasks, it's equally hard because you're going to make a mistake. And imagine an agent that actually breaks down a complex task into 10, into a series of 10 individual tasks. Then your error rate is going to compound. Say each step is 90% accurate. So it is incredible, but I think it is still... [31:04] something that we are [31:07] Um, [31:08] I feel like the human assist is actually critical in building these... [31:14] complicated workflows. [31:17] It's also, Arvind, it might be worth saying a word. Maybe this is obvious to people who are listening, but just to say it explicitly, you know, [31:25] how Act 1, which is the enterprise search business, [31:28] gives you the moral authority or the unfair advantage to get into Act 2, which is the application platform or the platform for agentic behavior. [31:38] It may not be totally obvious to people how Act 1 leads to Act 2. Can you just say a couple words about that? By building the search product, [31:46] which immediately adds value to our customers, to our users.
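Editor's note on the error-compounding remark at [30:42]: the arithmetic behind "your error rate is going to compound" can be sketched in a few lines. This is only an illustration, assuming the 10 steps succeed or fail independently and that each is right 90% of the time; the function name `chain_success_rate` is made up for this note and is not something from the conversation.

```python
def chain_success_rate(step_accuracy: float, steps: int) -> float:
    """Chance that every step in a chain is correct.

    Assumes the steps are independent, which is a simplification:
    real agent steps often share context and fail together.
    """
    return step_accuracy ** steps


# The example from the conversation: 10 steps, each 90% accurate.
rate = chain_success_rate(0.90, 10)
print(f"All 10 steps correct: {rate:.1%}")  # roughly 34.9%
```

Under those assumptions, a chain of individually reliable steps finishes without any mistake only about a third of the time, which is consistent with the point that a human assist is still critical for complicated workflows.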
[31:51] We are able to actually solve a bunch of complicated problems that you will typically run into in an enterprise. [31:59] the [32:01] The first part of that is security. So if you think about... [32:05] like the Glean product, you know, we are actually telling our customers that, hey, give us all of your data and, [32:11] you know, and we're going to hopefully do something useful for you after you give that data to us. And that's a big demand. Like, you know, it's not easy, [32:22] you know, for companies to actually trust like new, you know, a new product company, a startup, with all of their data when they're actually not, you know, getting any immediate value from it. And so that is one of the things, you know, that, you know, we've seen to be, you know, super helpful to us because we actually have this search product that people understand, that people want, and they want to deploy. So it's already deployed now. And so Glean is running and it's already connected with all of your enterprise data [32:52] inside the company. [32:53] And so then like, you know, helping, you know, like, you know, us going to our customers and then saying then that like, look, you know, use that as your core, [33:01] core AI data platform is a much easier sell because we don't actually have to convince them again to, to actually like, you know, to, you know, give all of the data to us. It's already there. I say this, this might not be a perfect analogy, but hopefully it's not a terrible analogy that, you know, Tesla had an advantage in self-driving because they're already selling cars. You guys have an advantage in delivering AI agents because you're already selling a data platform that organizes all the enterprise information, makes it accessible, makes it secure, kind of puts people in a position where they're already asking questions of it.
[33:31] of a logical next step to ask it to start taking actions. Absolutely. [33:34] I think you also announced a set of APIs that let developers build on Glean. Maybe say a word on that. I think that was in response to customer demands. What makes developers want to build on Glean versus directly access their own data? [33:47] I think it's probably a similar effect to what you just talked about. [33:49] Yeah, so like a lot of like AI applications that our customers are wanting to build, [33:55] they [33:56] they need to actually tap into data that lives in like, you know, multiple different like SaaS, you know, cloud-based SaaS systems. And, and like, I think it's quite tedious for them [34:08] to like first go and actually bring that data [34:12] like in one place, build a search or retrieval layer using that. [34:17] The integrations are hard. [34:20] Understanding permissions and governance is really, really hard on that. And I think when people, as these models actually became accessible and developers started to actually develop AI applications, they realized that the, well, they were really excited about building these new cool AI apps, but basically they realized that in building an app, 90% of the work was actually this boring infrastructure work that they didn't want to do. [34:49] Like bringing data from all these different systems, running this ETL and data pipelines, and then sort of building a good search over it. And so you'll spend so much time before you actually even get to play with AI.
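Editor's note on the infrastructure work described at [34:08] through [34:49]: a toy sketch may help show why permissions have to be part of the retrieval layer itself. Everything below is hypothetical and for illustration only; the `Document` type, the group names, and the word-overlap scoring are assumptions made for this note, and none of it represents Glean's API or ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str                      # hypothetical source system, e.g. "wiki"
    text: str
    allowed_groups: frozenset[str]   # groups permitted to read this document


def search(query: str, user_groups: set[str], corpus: list[Document],
           limit: int = 5) -> list[Document]:
    """Return the documents this user may read, best word overlap first."""
    terms = set(query.lower().split())
    scored = []
    for doc in corpus:
        # Permission check comes before scoring: a document the user
        # cannot read must never influence or appear in the results.
        if not (doc.allowed_groups & user_groups):
            continue
        score = len(terms & set(doc.text.lower().split()))
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


# Made-up corpus pulled from three imaginary systems.
CORPUS = [
    Document("d1", "wiki", "quarterly roadmap for search ranking", frozenset({"eng"})),
    Document("d2", "hr", "compensation roadmap draft", frozenset({"hr"})),
    Document("d3", "tickets", "search latency bug", frozenset({"eng", "support"})),
]

for doc in search("search roadmap", {"eng"}, CORPUS):
    print(doc.doc_id, doc.source)
```

The same query returns different results for different users, which is the personalization constraint described in the episode summary. A real system would also need connectors, incremental sync, and a far better ranking function, which is the "boring infrastructure work" being discussed here.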
[35:02] And so that's the thing that they find very useful with Glean because we're actually solving all the problems around ETL, building a great search, properly obeying governance within your company. All of that stuff is done for you. You just have a search API and you can... [35:22] sort of focus all of your attention on the business problem that you're working on and how AI can help you sort of like, you know, achieve that automation that you're looking for. [35:30] In some ways, all the hard work you've done to ETL and put all the data together with data governance reminds me a lot of Snowflake. You're really doing it with text data and unstructured data, but just that central data platform that companies can build around, build apps on top of. [35:49] Reminds me a lot of the Snowflake story. [35:52] Yeah. [35:53] Arvind, can we ask you a question about future state? If you allow us to dream for a few minutes, five or ten years from now, [36:02] how do you think Glean is showing up inside of a business? And maybe more importantly, if you're the typical knowledge worker five or 10 years from now and you are equipped with Glean, what is your life like then? [36:14] That's a great question. I think let's keep it five years instead of 10. And I think the [36:22] Well, I mean like one belief that I have [36:25] is that majority of the work that we do today [36:28] is not going to be done by us anymore in five years from now. [36:32] And that applies to me, that applies to you. Like, you know, we both do very different things, but still like, I think, you know, we are knowledge workers.
[36:38] and [36:40] I think a lot of our work is actually going to be done by these amazing AI assistants that are actually in many ways, you know, more powerful than us. Like, you know, they have access to [36:50] all of our company's data, our knowledge, they have all the context from all the past conversations and meetings. [36:56] They don't forget anything. And they can really sort of, and they have the, you know, on top of that, you know, they have the reasoning capabilities, you know, that allow them to be super helpful to you in like any tasks that you do. So that's sort of, you know, our core belief, like, you know, that, like, you know, more, majority of our work is actually going to be done by these AI companions or assistants, and we want Glean to be, [37:16] you know, that assistant in the workplace. [37:20] We want Glean to be the... [37:25] the place where most of your work happens. One of the things that we also think is going to change is today, a lot of AI is about [37:35] is about like, you know, you go and seek help from these AI agents. [37:40] Um... [37:41] you know, like for example, you go and ask questions, you get answers back, but, but the future is sort of [37:51] future is where these assistants are going to be proactive. [37:54] It's sort of like if you think about if you have an executive assistant, [37:59] you know, they actually help you a lot. [38:01] Um, [38:02] A lot of their help is when you go and ask them for help, but a lot of their help is actually proactive in nature. They tell you what to do next. They manage your day. They know everything about your work life.
[38:12] and they guide you to be effective throughout the day. And I think AI is going to actually allow that [38:18] luxury, regardless of who you are. Today, some executives in the company, [38:26] you know, have that luxury, but in the future, like, you know, everybody's going to have these really powerful, [38:31] you know, AI-based [38:32] assistants that are going to actually help them do their work. So we're really excited about bringing that change to the workplace and hope that Glean can be the world's most successful AI assistant. [38:46] Love it. [38:47] Arvind, can we change gears a little bit? And I'd love to step back and hear your advice for other founders. You are one of the most successful application-level AI companies, [38:56] I think probably number two behind Copilot in scale, and you did it as a startup, as an independent startup. [39:03] I think you've also had to navigate some unique challenges, right? Like OpenAI, for example, is one of your providers and also one of your competitors, one of your top competitors. [39:12] Maybe just tell us what that dynamic is like. [39:15] Well, I mean, first of all, the... [39:17] from a point of view of like building a startup, [39:21] in fact, I've been actually quoting you guys in many places. Like Pat, I remember [39:27] like the slide where you talk about [39:29] like the overall software market being 600 billion dollars. [39:33] But then AI is expanding that market to like 15 trillion or 12 trillion, something massive. And that's actually the reality of where we are today, which is that
[39:47] everything that we do is going to actually change, fundamentally change. [39:51] AI is going to be a key component that's going to drive that change. [39:56] So, so first thing that I like, you know, as a founder, like, you know, I don't actually worry about what other people are actually working on. [40:02] Because even if all of us are working on a lot of great things, it'll still not be enough. It'll still not be enough to actually solve all the problems that need to get solved. [40:11] And so that's the first mindset. So like, you know, like I think from an advice to other founders, that's the thing I want to tell them. Like if you found a problem, [40:19] you know, just go, you know, work on it and don't worry about like if somebody else is solving it, [40:23] because the chances are that other people are not, [40:26] and they won't solve it the same way as you will. [40:28] Um, [40:29] But for us, coming back to Glean specifically, the dynamics for us, we felt the same way. For the first four years of our existence, we were working on a problem, [40:41] you know, where we had no competition. [40:43] You know, nobody, nobody was actually interested in solving the problem that we were solving. [40:48] It was a dead market and we had to create a category, actually generate interest, be evangelical. [40:55] But we knew that we were working on an important problem. [41:01] But then, you know, suddenly, you know, ChatGPT happened and search has become hot. And now, in fact, [41:08] every company that you go and talk to wants to build a product like Glean. [41:11] And so is that good news for us? Is that bad news for us? Like, how do you think about it? You know, from, from, from our perspective, like, it doesn't matter. Like, you know, either way we feel is that it's actually great news for us. 
Like now everybody's interested, everybody wants to buy our product and yes, we have to compete with many, many other vendors, but, but that's, that's the place where we think, you know, we'll win because, you know, we will,
[41:33] you know, like we have the desire [41:36] to actually solve this problem [41:38] and stay focused on this problem, keep working on it, and there's no reason for us to not do a better job than others. [41:44] - Yeah. [41:44] Part of what I heard in there was that building an AI company is just [41:48] building a company, find an important problem, and solve it in a compelling way. I'm curious, particularly because this is not your first rodeo. Rubrik was obviously wildly successful, and of course you were pretty core to some of the early days of Google. [42:02] How much of building an AI company is just [42:05] building a company versus things that are AI specific in some way? [42:09] It's a great question. [42:11] I... [42:12] I think like... [42:14] I think of AI mostly as a tool, [42:17] you know, [42:18] in your arsenal, [42:20] and one of the tools. [42:22] I don't think [42:23] you suddenly become a different company because you are [42:27] because you're doing something with AI. [42:30] In reality, I think there's going to be no new company that is not going to use, [42:35] you know, AI technologies in some shape or form. [42:37] So my point of view is [42:39] that like, look, you have to actually find a business problem that you're planning to solve. [42:44] And hopefully, like, you know, the... [42:48] you can actually solve that problem in a much better way, because of the technology that AI is actually providing to you now. So I don't think it actually changes. I don't think it actually feels different, [43:00] like, you know, whether you are [43:03] We don't think of ourselves as an AI company for that matter either.
[43:07] Would you ever train your own models? [43:09] I guess maybe more broadly, how do you think about where Glean's core competencies [43:13] start and stop and, you know, you get 100 R&D chips, where do you want to place them? [43:17] We don't have plans to train super large models. But at the same time, we do train models which are smaller in size [43:28] for every individual customer of ours. [43:33] These language models that we train for [43:37] an individual customer, it sort of goes through all of their [43:40] their own enterprise corpus and sort of starts to understand, like, you know, the sort of the lingo, the speak, you know, the acronyms, the code names, all of that stuff, you know, within their corpus. So, so model training is actually, you know, like a core part of the Glean core technology, but, but not in the sense of like, you know, you know, training a model like GPT-4, you know, we don't do that. We don't have plans for it. We plan to partner with a lot of other great companies, you know, that do that, that build models of that scale. [44:09] Wonderful. Arvind, thank you so much for joining us today. This was a wonderful conversation. We really appreciate it. [44:15] Thank you for having me. [44:17] Thank you.