Jay Dawani is Co-founder & CEO of Lemurian Labs – Interview Series

Jay Dawani is Co-founder & CEO of Lemurian Labs. Lemurian Labs is on a mission to deliver affordable, accessible, and efficient AI computers, driven by the belief that AI should not be a luxury but a tool accessible to everyone. The founding team at Lemurian Labs combines expertise in AI, compilers, numerical algorithms, and computer architecture, united by a single purpose: to reimagine accelerated computing.

Can you walk us through your background and what got you into AI in the first place?

Absolutely. I'd been programming since I was 12, building my own games and such, but I really got into AI when I was 15, thanks to a friend of my father's who was into computers. He fed my curiosity and gave me books to read such as Von Neumann's 'The Computer and the Brain', Minsky's 'Perceptrons', and Russell and Norvig's 'AI: A Modern Approach'. These books influenced my thinking a great deal, and it felt almost obvious then that AI was going to be transformative and that I just had to be a part of this field.

When it came time for university, I really wanted to study AI, but I couldn't find any universities offering it, so I decided to major in applied mathematics instead. A little while after I got to college, I heard about AlexNet's results on ImageNet, which was really exciting. In that moment I had a now-or-never realization and went full bore into reading every paper and book I could get my hands on related to neural networks, and sought out all the leaders in the field to learn from them, because how often do you get to be there at the birth of a new industry and learn from its pioneers?

Very quickly I realized I don't enjoy research, but I do enjoy solving problems and building AI-enabled products. That led me to working on autonomous cars and robots, AI for materials discovery, generative models for multi-physics simulations, AI-based simulators for training professional racecar drivers and helping with car setups, space robots, algorithmic trading, and much more.

Now, having done all that, I'm trying to rein in the cost of AI training and deployment, because that will be the greatest hurdle we face on our path to enabling a world where every person and company can have access to, and benefit from, AI in the most economical way possible.

Many companies working in accelerated computing have founders who have built careers in semiconductors and infrastructure. How do you think your background in AI and mathematics affects your ability to understand the market and compete effectively?

I actually think not coming from the industry gives me the benefit of the outsider's advantage. I have found very often that not knowing industry norms or conventional wisdom gives one the freedom to explore more freely and go deeper than most others would, because you're unencumbered by biases.

I have the freedom to ask 'dumber' questions and test assumptions in a way that most others wouldn't, because so many things are accepted truths. In the past two years I've had several conversations with folks within the industry who were very dogmatic about something but couldn't tell me the provenance of the idea, which I find very puzzling. I like to understand why certain choices were made, what assumptions or conditions held at the time, and whether they still hold.

Coming from an AI background, I tend to take a software view: look at where the workloads are today and all the possible ways they might change over time, and model the entire ML pipeline for training and inference to understand the bottlenecks, which tells me where the opportunities to deliver value are. And since I come from a mathematical background, I like to model things to get as close to truth as I can and have that guide me. For example, we've built models that calculate system performance for total cost of ownership, so we can measure the benefit we can bring to customers with software and/or hardware and better understand our constraints and the different knobs available to us, along with dozens of other models for various things. We're very data driven, and we use the insights from these models to guide our efforts and tradeoffs.
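To make the idea of such a model concrete, here is a minimal, purely illustrative sketch of a total-cost-of-ownership calculation. Every parameter and number below is a hypothetical placeholder, not Lemurian's actual model or data:

```python
# Hypothetical sketch of a simple total-cost-of-ownership (TCO) model.
# All figures are illustrative placeholders, not real product data.

def tco_per_inference(
    capex_per_server: float,   # upfront hardware cost ($)
    lifetime_years: float,     # amortization window
    power_watts: float,        # average server power draw
    energy_cost_kwh: float,    # electricity price ($/kWh)
    throughput_qps: float,     # sustained inferences per second
    utilization: float = 0.6,  # fraction of time doing useful work
) -> float:
    """Return an estimated cost per inference in dollars."""
    seconds = lifetime_years * 365 * 24 * 3600
    useful_inferences = throughput_qps * utilization * seconds
    energy_kwh = power_watts / 1000 * lifetime_years * 365 * 24
    total_cost = capex_per_server + energy_kwh * energy_cost_kwh
    return total_cost / useful_inferences

# Compare a baseline system against one with 3x the throughput
# at the same power: the cost-per-inference knob moves directly.
baseline = tco_per_inference(150_000, 4, 6_000, 0.10, 2_000)
improved = tco_per_inference(150_000, 4, 6_000, 0.10, 6_000)
print(f"baseline: ${baseline:.2e}/inference, improved: ${improved:.2e}/inference")
```

A real model would add networking, cooling, software, and depreciation terms, but even this toy version shows how throughput, power, and utilization trade off against each other.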

It seems like progress in AI has primarily come from scaling, which requires exponentially more compute and energy. We appear to be in an arms race, with every company trying to build the biggest model, and there seems to be no end in sight. Do you think there is a way out of this?

There are always ways. Scaling has proven extremely useful, and I don't think we've seen the end of it yet. We will very soon see models being trained at a cost of at least a billion dollars. If you want to be a leader in generative AI and create bleeding-edge foundation models, you'll need to be spending at least a few billion a year on compute. Now, there are natural limits to scaling, such as being able to construct a large enough dataset for a model of that size, having access to people with the right know-how, and having access to enough compute.

Continued scaling of model size is inevitable, but we also can't turn the entire surface of the earth into a planet-sized supercomputer to train and serve LLMs, for obvious reasons. To get this under control we have several knobs we can play with: better datasets, new model architectures, new training methods, better compilers, algorithmic improvements and exploitations, better computer architectures, and so on. If we do all of that, there are roughly three orders of magnitude of improvement to be found. That's the best way out.
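As a rough back-of-the-envelope illustration of how independent gains compound to that scale, the individual factors below are hypothetical, chosen only to show the multiplicative arithmetic:

```python
# Back-of-the-envelope: independent efficiency gains compound
# multiplicatively. The factors are hypothetical placeholders,
# not measured results for any of these techniques.
knobs = {
    "better datasets / training methods": 3.0,
    "new model architectures": 5.0,
    "better compilers": 2.0,
    "algorithmic improvements": 7.0,
    "better computer architectures": 10.0,
}

total = 1.0
for name, gain in knobs.items():
    total *= gain
    print(f"{name}: x{gain:g} (cumulative x{total:g})")

# 3 * 5 * 2 * 7 * 10 = 2100, i.e. roughly three orders of magnitude.
print(f"combined improvement: ~x{total:.0f}")
```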

You are a believer in first-principles thinking. How does this shape your mindset for how you run Lemurian Labs?

We definitely employ a lot of first-principles thinking at Lemurian. I have always found conventional wisdom misleading, because that knowledge was formed at a certain point in time when certain assumptions held; but things always change, and you have to retest assumptions often, especially when living in such a fast-paced world.

I often find myself asking questions like "this seems like a really good idea, but why might it not work?", or "what needs to be true for this to work?", or "what do we know to be absolute truths, and what are the assumptions we're making, and why?", or "why do we believe this particular approach is the best way to solve this problem?". The goal is to invalidate and kill off ideas as quickly and cheaply as possible. We want to maximize the number of things we're trying out at any given point in time. It's about being obsessed with the problem that needs to be solved, and not being overly opinionated about which technology is best. Too many people focus too much on the technology; they end up misunderstanding customers' problems and missing the transitions happening in the industry that could invalidate their approach, leaving them unable to adapt to the new state of the world.

But first-principles thinking isn't all that useful by itself. We tend to pair it with backcasting, which basically means imagining an ideal or desired future outcome and working backwards to identify the different steps or actions needed to realize it. This ensures we converge on a meaningful solution that is not only innovative but also grounded in reality. It doesn't make sense to spend time coming up with the perfect solution only to realize it isn't feasible to build because of real-world constraints such as resources, time, or regulation, or to build a seemingly perfect solution only to find out later that you've made it too hard for customers to adopt.

From time to time we find ourselves in a situation where we need to make a decision but have no data. In that scenario we employ minimum testable hypotheses, which give us a signal as to whether or not something makes sense to pursue with the smallest amount of energy expenditure.

All of this combined gives us agility and rapid iteration cycles to de-risk items quickly, and it has helped us adjust strategy with high confidence and make a lot of progress on very hard problems in a very short period of time.

Initially, you were focused on edge AI. What caused you to refocus and pivot to cloud computing?

We started with edge AI because at the time I was very focused on trying to solve a very particular problem that I had faced in trying to usher in a world of general-purpose autonomous robotics. Autonomous robotics holds the promise of being the biggest platform shift in our collective history, and it seemed like we had everything needed to build a foundation model for robotics, but we were missing the ideal inference chip with the right balance of throughput, latency, energy efficiency, and programmability to run that foundation model on.

I wasn’t excited about the datacenter at the moment because there have been good enough firms focusing there and I expected they’d figure it out. We designed a extremely powerful architecture for this application space and were on the point of tape it out, after which it became abundantly clear that the world had modified and the issue truly was within the datacenter. The speed at which LLMs were scaling and consuming compute far outstrips the pace of progress in computing, and once you consider adoption it starts to color a worrying picture. 

It felt like this is where we should be focusing our efforts: bringing down the energy cost of AI in datacenters as much as possible without imposing restrictions on where and how AI should evolve. And so we got to work on solving this problem.

Can you share the genesis story of co-founding Lemurian Labs?

The story starts in early 2018. I was working on training a foundation model for general-purpose autonomy, along with a model for generative multiphysics simulation to train the agent in and fine-tune it for different applications, and some other things to help scale into multi-agent environments. But very quickly I exhausted the compute I had, and I estimated needing more than 20,000 V100 GPUs. I tried to raise enough to get access to the compute, but the market wasn't ready for that kind of scale just yet. It did, however, get me thinking about the deployment side of things, and I sat down to calculate how much performance I would need for serving this model in the target environments, and I realized there was no chip in existence that could get me there.

A couple of years later, in 2020, I met up with Vassil, my eventual cofounder, to catch up, and I shared the challenges I had gone through in building a foundation model for autonomy. He suggested building an inference chip that could run the foundation model, and shared that he had been thinking a lot about number formats, and how better representations would help not only in making neural networks retain accuracy at lower bit-widths but also in creating more powerful architectures.
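For readers unfamiliar with the idea, here is a minimal sketch of why number formats matter: the same weights quantized to fewer bits lose precision, and the error grows as the bit-width shrinks. This uses generic uniform quantization purely for illustration; it is not Lemurian's format:

```python
import numpy as np

# Generic uniform quantization, for illustration only: the error of
# representing the same weights grows as the bit-width shrinks, which
# is why the choice of number format matters at low precision.

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 1.0, size=100_000)

def uniform_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Quantize x onto a symmetric uniform grid with 2**bits levels."""
    levels = 2 ** bits
    scale = np.abs(x).max() / (levels / 2 - 1)
    return np.round(x / scale) * scale

for bits in (16, 8, 4):
    err = np.sqrt(np.mean((weights - uniform_quantize(weights, bits)) ** 2))
    print(f"{bits}-bit uniform quantization: RMS error = {err:.6f}")
```

A format whose representable values better match the distribution of the data (which is roughly what better representations buy you) loses less accuracy at the same bit-width than the uniform grid shown here.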

It was an intriguing idea, but way out of my wheelhouse. However, it wouldn't leave me, which drove me to spend months and months learning the intricacies of computer architecture, instruction sets, runtimes, compilers, and programming models. Eventually, building a semiconductor company began to make sense, and I had formed a thesis around what the problem was and how to go about solving it. And then, towards the end of the year, we started Lemurian.

You’ve spoken previously in regards to the have to tackle software first when constructing hardware, could you elaborate in your views of why the hardware problem is at the beginning a software problem?

What a lot of people don't realize is that the software side of semiconductors is much harder than the hardware itself. Building a useful computer architecture that customers can use and benefit from is a full-stack problem, and if you don't have that understanding and preparedness going in, you'll end up with a great-looking architecture that is very performant and efficient but totally unusable by developers, and usability is what actually matters.

There are other advantages to taking a software-first approach as well, of course, such as faster time to market. That is crucial in today's fast-paced world, where being too bullish on an architecture or feature could mean you miss the market entirely.

Not taking a software-first view generally results in not having de-risked the things required for product adoption in the market, not being able to respond to changes in the market (for instance, when workloads evolve in an unexpected way), and having underutilized hardware. All not great things. That's a big reason why we care so much about being software-centric, and why our view is that you can't be a semiconductor company without really being a software company.

Can you discuss your immediate software stack goals?

When we were designing our architecture and thinking about the forward-looking roadmap and where the opportunities were to bring more performance and energy efficiency, it started becoming very clear that we were going to see a lot more heterogeneity, which was going to create a lot of issues on the software side. And we don't just need to be able to productively program heterogeneous architectures; we have to deal with them at datacenter scale, which is a challenge the likes of which we haven't encountered before.

This got us concerned, because the last time we had to go through a major transition was when the industry moved from single-core to multi-core architectures, and at that time it took 10 years to get software working and people using it. We can't afford to wait 10 years to figure out software for heterogeneity at scale; it needs to be sorted out now. And so, we got to work on understanding the problem and what needs to exist for this software stack to come into being.

We’re currently engaging with numerous the leading semiconductor firms and hyperscalers/cloud service providers and shall be releasing our software stack in the subsequent 12 months. It’s a unified programming model with a compiler and runtime able to targeting any form of architecture, and orchestrating work across clusters composed of various sorts of hardware, and is able to scaling from a single node to a thousand node cluster for the very best possible performance.
