Toronto-based generative artificial intelligence (AI) startup Viggle AI has secured nearly $26 million CAD ($19 million USD) in Series A funding to fuel the growth of its platform, which uses AI to help users create videos from simple text and image prompts.
Viggle launched its app this March, and the capabilities of its tech went viral shortly thereafter when videos began circulating online of Joaquin Phoenix’s Joker persona replacing rapper Lil Yachty in his stage entrance at the 2021 Summer Smash Festival. The clip turned into a trending meme format this April, when social media users began using Viggle to insert other celebrities and characters into the same video.
This helped the early-stage AI startup amass a community of more than 4.3 million members on the messaging platform Discord, and secure fresh financing from Silicon Valley’s Andreessen Horowitz (a16z) and Toronto-based Two Small Fish Ventures (TSFV).
In an interview with BetaKit, Viggle co-founder and CEO Hang Chu said that Viggle was “really, really excited by this virality,” noting that its product growth made closing this round “a little bit easier.”
Viggle’s all-equity, all-primary Series A round closed earlier this month, and was led by a16z with support from fellow new investor TSFV and other undisclosed backers. Chu declined to disclose the startup’s valuation or the details of its prior financings to BetaKit. This round brings Viggle’s total funding to more than $27 million CAD ($20 million USD). The startup plans to put its latest capital towards developing a stronger model, adding new capabilities, and improving its user experience.
“There’s a lot of text-to-video generators out there, but they are mainly pixel-based models [that] are hard to control and hard to really precisely edit what is happening in the video,” Chu said. “What we’re doing differently is we emphasize the controllability side.”
According to Chu, one of the things that differentiates Viggle is its proprietary JST-1 tech, a video-3D foundation model that incorporates knowledge of physics to support the creation of more lifelike character movements and expressions. Using text and existing photos or videos, Viggle users can specify a character and a type of motion for it to perform, and Viggle will generate an animated video based on these requests with the help of AI.
Viggle’s existing software is available for free and through a paid Pro subscription ($9.99 USD per month) that unlocks additional capabilities. It caters to a variety of users, helping content creators and everyday users quickly generate animated character videos from prompts, and streamlining the ideation and pre-production process for professional animation engineers, game designers, and visual effects artists.
According to Chu, Viggle currently has two main types of users: folks using it as a new tool for making memes, and professionals ranging from creators to workers at movie and game studios leveraging it as a content-making and visualization tool. To support the latter group, Viggle has recently launched a Creator Program, a free initiative that comes with a Pro subscription, an additional 1,000 credits—equivalent to 250 minutes of video—early access to new features, and opportunities to connect with fellow creators, among other things.
RELATED: Two Small Fish holds $41-million CAD final close for Fund III
Chu acknowledged that the AI video generation market has become competitive, but claimed that what Viggle is doing in terms of controllable video generation is currently unique. While general-purpose text-to-video models are good at the ideation phase, he argued that today, Viggle’s model “really shines in the post-editing” phase.
Chu is a former PhD candidate at the University of Toronto who studied computer vision and machine learning under Waabi’s Raquel Urtasun and Nvidia’s Sanja Fidler. He also previously worked as a researcher at Autodesk, Facebook, Nvidia, and Google.
TSFV co-founder and general partner Eva Lau described Chu to BetaKit as “an exceptional founder with deep technical expertise,” adding that the tech that underpins Viggle’s JST-1 foundational model is also “extremely unique and very difficult for others to replicate.”
“We’re essentially building a graphics engine with neural networks,” Chu said. While Viggle has started with a character model, he noted that over time, the startup plans to layer on more capabilities, including the ability to generate objects, character-object interaction, and eventually entire scenes.
Chu likened the startup’s existing offering to a prototype. “This is a proof of concept that this can work,” he said, adding that the startup is training stronger models to contend with more complex requests and improve the quality of the videos it can generate, which is limited at the moment, with shaky character movement and unchanging facial expressions.
Asked whether he sees Viggle’s tech replacing or supporting work done by humans, Chu claimed, “What we’re trying to do is all about empowering and augmenting the creators rather than replacing the creativity.” He asserted that the startup’s focus is on providing users with tools to streamline the animated video creation process—not automating it altogether.
But the technology behind AI-generated videos, commonly known as deepfakes, can be dangerous. At its worst, it can be used to create misinformation and non-consensual pornography, and to power scams. When asked how Viggle is navigating the potential for its product to facilitate copyright infringement and abuse, Chu said the startup has “community guidelines, policies, and terms in place that users need to make sure they have permission to use what they choose to upload.”
“We have implemented moderation mechanisms to protect against abuse, such as [not safe for work] content and political figures,” Chu added, noting that this is an area of focus for Viggle. “We are actively working on this, including a reporting and takedown process, to deal with potential copyright infringement and abuse.”
On the copyright side of the equation, when asked what data Viggle’s AI video models are trained on, Chu told BetaKit that the startup uses public sources. “Viggle leverages a variety of public sources to generate AI content,” he said. “Our training data has been carefully curated and refined, ensuring compliance with all terms of service throughout the process.” It is unclear what “public” means in this context, but given how other tech companies have approached training AI systems, it may refer to availability rather than rights to use said data.
Chu’s answer bears some resemblance to what OpenAI CTO Mira Murati said about the training data used for its text-to-video model, Sora, when she told The Wall Street Journal the company uses “publicly available data” as well as licensed data. For his part, Chu told TechCrunch that Viggle’s training data set included YouTube videos, an admission a company spokesperson tried to walk back before eventually confirming to TechCrunch.
This may create issues for Viggle. Earlier this year, when asked about OpenAI potentially using YouTube to train Sora, YouTube CEO Neal Mohan told Bloomberg that using YouTube videos to train an AI text-to-video generator would be a “clear violation” of YouTube’s terms of service.
At the same time, Viggle is not alone. According to Wired, many other AI model developers, including Apple, Anthropic, and Nvidia, have also used YouTube videos as AI training fodder.
RELATED: Radical Ventures launches $800-million USD AI growth fund
Earlier this month, Nvidia was hit with a class-action lawsuit from YouTube creators for training its models on their content, after a 404 Media investigation found that Nvidia scraped massive amounts of video from the platform and uncovered internal Nvidia conversations showing that workers knew the action might not be legal.
Viggle’s rapid growth has come with its own challenges, from managing its fast-growing Discord community to longer wait times caused by high demand in some instances. Chu noted that the startup is working to ensure it has the capacity necessary to meet users’ needs, and credited Discord and its moderators for their help in overseeing its large online community.
TSFV, which targets early-stage deep tech startups, announced the final close of its third, $41-million CAD fund two months ago. Viggle joins a TSFV portfolio that also includes fellow Toronto AI startup Ideogram, a Midjourney competitor that secured its own $80-million USD Series A round and launched its latest text-to-image model in February.
The venture capital (VC) firm was created by husband-and-wife team Allen and Eva Lau, former leaders at Wattpad, a Toronto-based social storytelling platform that was acquired by South Korea’s Naver in 2021 for more than $754 million CAD. Allen Lau, Wattpad’s co-founder and former CEO, is now TSFV’s operating partner, while Eva Lau, previously Wattpad’s head of community and content, currently steers TSFV.
RELATED: Midjourney competitor Ideogram closes $80-million Series A, launches latest text-to-image model
AI has also been a focus for a16z as of late. In April, the stage-agnostic VC giant closed $7.2 billion USD for its newest set of funds, including some big bets on AI with $1.25 billion dedicated to AI infrastructure and $1 billion for AI apps. The firm has also reportedly been building a stash of sought-after AI chips to win deals in the space.
As the broader tech market has cooled amid the macroeconomic downturn, and many other startups and fund managers have struggled to raise capital, AI has remained hot. These conditions have benefitted Canadian firms like Radical Ventures and startups like Viggle, Ideogram, and Toronto-based large-language model developer and OpenAI rival Cohere.
However, since the AI funding frenzy began after OpenAI’s release of ChatGPT in late 2022, some investors have become more wary of AI and how much many companies are spending on it, and ChatGPT’s growth has flatlined. In June, Thomson Reuters chief product officer David Wong told BetaKit that “The reality check is happening now,” as enterprise buyers have become more discerning and started coming to terms with where AI is working and where it is not.
But Viggle’s investors are bullish on the startup’s prospects given its approach and progress to date. In a statement, a16z partner Justine Moore noted that the firm has been impressed by Viggle’s early momentum and the user base it has built in a matter of months.
Eva Lau characterized Viggle’s recent growth as “whiplash-inducing.” She claimed TSFV spotted Viggle and began discussions with the startup in March when it had thousands of users.
She claimed that Viggle’s JST-1 tech is “the first-in-the-world 3D-video foundation model with actual physics understanding,” and argued it “gives Viggle AI a huge first-mover advantage.”
Allen Lau, who has joined Viggle as an advisor, told BetaKit that he sees room for Viggle to become more than just a simple meme generator and “cause massive disruption in the content creation space.” The tech entrepreneur-turned-investor plans to lean on his decades of experience building and leading Wattpad and other startups to support Viggle, which he argued “is going to be the next big Canadian AI superstar.”
UPDATE (08/26/24): This story has been updated to include additional responses from Viggle AI co-founder and CEO Hang Chu and context about the risks associated with deepfakes.
Feature image courtesy Viggle AI.