Lior Hakim is the Co-founder and Chief Technical Officer of Hour One, an industry leader in crafting virtual humans for professional video communications. Its lifelike virtual characters, modeled exclusively after real people, are driven entirely by text and convey human-like expressiveness, empowering businesses to elevate their messaging with unmatched ease and scalability.
Could you share the genesis story behind Hour One?
The origin of Hour One can be traced back to my involvement in the crypto domain. After that endeavor, I began pondering what the next big thing that mass cloud compute could tap into would be, and as machine learning was gaining popularity in recommendations and predictive analytics, I was working on a few ML infrastructure projects. Through this work I became familiar with early generative models and was especially interested in GANs at the time. I was using all the compute I could get my hands on to test those then-new technologies. When I showed my results to a friend who had a company in the field, he told me I had to meet Oren. When I asked why, he said that maybe the two of us would stop wasting his time and waste each other's time instead. Oren, my co-founder and the CEO of Hour One, was an early investor in AI. While we stood in different places, we were both moving in the same direction, and founding Hour One to be the Home of the Virtual Human was an inevitable journey.
What are some of the machine learning algorithms that are used, and what part of the process is Generative AI?
In the realm of video creation, machine learning algorithms are instrumental at every stage. At the scripting phase, Large Language Models (LLMs) offer invaluable support, crafting or refining content to ensure compelling narratives. As we move to audio, Text-to-Speech (TTS) algorithms morph text into organic, emotive voices. Transitioning to the visual representation, our proprietary multimodal foundation model of the virtual human takes center stage. This model, enhanced with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), is adept at conveying contextual emotion, enunciation, and an articulate, captivating, and authentic delivery. Such generative techniques turn text and audio cues into lifelike visuals of virtual humans, leading to hyper-realistic video outputs. The orchestration of LLMs, TTS, GANs, VAEs, and our multimodal model makes Generative AI not just a part but the backbone of modern video production.
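The staged pipeline described above can be sketched in code. The following Python is a minimal illustrative skeleton, not Hour One's actual system: every function, class, and parameter here is a hypothetical stand-in for the LLM, TTS, and multimodal rendering stages.

```python
from dataclasses import dataclass

# Hypothetical data types flowing between pipeline stages.
@dataclass
class Script:
    text: str

@dataclass
class Audio:
    samples: list   # placeholder for waveform data
    phonemes: list  # timing cues the visual model can condition on

@dataclass
class Video:
    frames: int

def refine_script(draft: str) -> Script:
    """Stage 1: an LLM would polish the draft into a final script."""
    return Script(text=draft.strip())

def synthesize_speech(script: Script) -> Audio:
    """Stage 2: a TTS model turns the script into an emotive voice track."""
    words = script.text.split()
    return Audio(samples=[0.0] * len(words), phonemes=words)

def render_virtual_human(script: Script, audio: Audio) -> Video:
    """Stage 3: a multimodal generative model (e.g. GAN/VAE-based) would map
    text and audio cues to lifelike frames; here we only estimate a frame
    count from assumed timing (30 fps, ~0.3 s per spoken word)."""
    fps, seconds_per_word = 30, 0.3
    return Video(frames=int(len(audio.phonemes) * seconds_per_word * fps))

def generate_video(draft: str) -> Video:
    """Orchestrates the three stages end to end."""
    script = refine_script(draft)
    audio = synthesize_speech(script)
    return render_virtual_human(script, audio)

print(generate_video("Welcome to our quarterly update.").frames)  # → 45
```

The point of the sketch is the orchestration: each stage consumes the previous stage's output, so any single model (script, voice, or visuals) can be swapped or upgraded independently.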
How does Hour One differentiate itself from competing video generators?
At Hour One, our distinction from other video generators doesn't stem from a preoccupation with competition, but rather from a deeply rooted philosophy governing our approach to quality, product design, and market strategy. Our guiding principle is to always prioritize the human element, ensuring our creations resonate with authenticity and emotion. We take pride in delivering the best quality in the industry without compromise. By utilizing advanced 3D video rendering, we provide our users with a genuine cinematic experience. Furthermore, our strategy is uniquely opinionated; we begin with a polished product and then rapidly iterate towards perfection. This approach ensures that our offerings are always a step ahead, setting new benchmarks in video generation.
With your extensive background in GPUs, can you share with us some insights on your views on NVIDIA Next-Generation GH200 Grace Hopper Superchip Platform?
The Grace Hopper architecture is truly a game changer. If the GPU can effectively work from its host's RAM without completely bottlenecking the computation, it unlocks currently impossible model-to-accelerator ratios in training and, as a result, much-desired flexibility in training job sizes. Assuming the entire stock of GH200s is not gulped up by LLM training, we hope to use it to greatly reduce prototyping costs for our multimodal architectures down the line.
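The appeal of those model-to-accelerator ratios is easy to see with a back-of-envelope calculation. The sketch below uses the publicly quoted GH200 capacities (96 GB of GPU-attached HBM3 plus up to 480 GB of coherently addressable Grace LPDDR5X); treat the figures as illustrative assumptions, and note it counts raw fp16 parameters only, ignoring activations and optimizer state.

```python
# Rough upper bound on model size (billions of fp16 parameters) that fits
# in memory, with and without spilling into coherent host RAM.
BYTES_PER_PARAM = 2  # fp16
hbm_gb = 96          # assumed GPU-attached HBM3 capacity
host_gb = 480        # assumed Grace LPDDR5X, coherently addressable

def max_params_billions(mem_gb: float) -> float:
    """Parameters (in billions) that fit in mem_gb gigabytes at fp16."""
    return mem_gb * 1e9 / BYTES_PER_PARAM / 1e9

print(max_params_billions(hbm_gb))            # HBM only → 48.0
print(max_params_billions(hbm_gb + host_gb))  # HBM + host RAM → 288.0
```

A roughly 6x jump in addressable parameters per accelerator is what makes previously impossible model/accelerator ratios plausible, provided the NVLink-C2C path to host memory doesn't bottleneck the math.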
Are there any other chips that are currently on your radar?
Our main goal is to provide users with video content that is price-competitive. Given the current demand for large-memory GPUs, we are constantly optimizing and trying out every GPU offering on the top cloud service providers. Moreover, we strive to be at least partially platform-independent in some of our workloads, so we are eyeing TPUs and other ASICs and also paying close attention to AMD. Eventually, any hardware-led optimization route that can yield a better FLOPs/$ ratio will be explored.
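Comparing offerings by FLOPs/$ comes down to simple arithmetic. This sketch ranks a few accelerator options by compute bought per dollar; all throughput and price figures are made-up placeholders, to be replaced with benchmarked TFLOPS and real cloud list prices.

```python
# Hypothetical accelerator offerings: sustained throughput vs. hourly rent.
offers = {
    "gpu_a":  {"tflops": 300.0, "usd_per_hour": 4.00},
    "gpu_b":  {"tflops": 150.0, "usd_per_hour": 1.50},
    "asic_c": {"tflops": 200.0, "usd_per_hour": 2.50},
}

def tflops_per_dollar(offer: dict) -> float:
    """TFLOP-seconds of compute bought per dollar of rental time."""
    return offer["tflops"] * 3600 / offer["usd_per_hour"]

# The cheapest chip per hour is not automatically the winner; the ratio is.
best = max(offers, key=lambda name: tflops_per_dollar(offers[name]))
print(best)  # → gpu_b
```

With these placeholder numbers the mid-tier GPU wins despite the lowest raw throughput, which is exactly why such a ratio, rather than peak FLOPS alone, drives hardware choices.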
What is your vision for future advancements in video generation?
In 24 months we won't be able to tell a generated human from a captured one. That will change a lot of things, and we are here at the forefront of those advancements.
At the moment most generated videos are for computers and mobile devices, what needs to change before we have photo realistic generated avatars and worlds for both augmented reality and virtual reality?
As of now, we possess the capability to generate photo-realistic avatars and worlds for both augmented reality (AR) and virtual reality (VR). The primary obstacle is latency. While the delivery of high-quality, real-time graphics to edge devices such as AR and VR headsets is vital, achieving this seamlessly is contingent upon several factors. Foremost, we're reliant on advancements in chip manufacturing to ensure faster and more efficient processing. Alongside this, optimizing power consumption is crucial to ensure longer usage without compromising the experience. Last but not least, we anticipate software breakthroughs that can efficiently bridge the gap between generation and real-time rendering. As these elements come together, we'll see a surge in the utilization of photo-realistic avatars and environments across both AR and VR platforms.
What do you expect to be the next big breakthrough in AI?
When it comes to the next significant breakthrough in AI, there's always an air of excitement and anticipation. While I've alluded to some advancements earlier, what I can share is that we are actively working on several groundbreaking innovations at this very moment. I'd love to delve into specifics, but for now, I encourage everyone to keep an eye on our upcoming releases. The future of AI holds immense promise, and we're thrilled to be at the forefront of these pioneering efforts. Stay tuned!
Is there anything else that you would like to share about Hour One?
You should definitely check out our Discord channel and API, both new additions to our platform offering at Hour One.
The post Lior Hakim, Co-founder & CTO of Hour One – Interview Series appeared first on Unite.AI.