Google’s new Gemini AI model can now listen directly to audio files
Gemini can now listen to and understand audio files
As you may know, the more data you feed an AI, the better it gets (and the freakier, if you're among the skeptics). Early models were trained mostly on text, which made them a natural fit for chatbots. Then they learned to process images, and today they can analyze a picture or generate an entirely new one from a prompt. Gemini (formerly known as Bard, for those who missed the rebrand) has been able to process images for a while, and now it's expanding to audio. The version that does this, Gemini 1.5 Pro, is currently in testing. It opens up a world of possibilities: upload an audio file and get a summary of a long keynote, a conversation, an earnings call, a lecture, and similar things.
Tools that summarize long calls already exist, but they work by transcribing the audio first and then summarizing the transcript. Gemini, by contrast, listens to the call itself, with no intermediate transcription step.
Don't be quick to get excited, though: for now, this isn't a public release. To try it, you'll need access through one of Google's developer platforms, Vertex AI or AI Studio. It's bound to reach the public as well, but we don't know when.
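For the curious, here's roughly what that could look like for developers. This is a minimal sketch using Google's google-generativeai Python SDK (the AI Studio route); the model name, file name, and prompt are assumptions on my part, and exact availability may differ while the audio feature is still in testing.

```python
import google.generativeai as genai

# Assumption: you have an AI Studio API key with access to the preview model.
genai.configure(api_key="YOUR_API_KEY")

# Upload the recording via the File API; "earnings_call.mp3" is a placeholder.
audio_file = genai.upload_file(path="earnings_call.mp3")

# Model id is an assumption; the preview name may vary.
model = genai.GenerativeModel("gemini-1.5-pro")

# Pass the prompt and the audio file together; the model listens to the
# audio directly rather than working from a transcript.
response = model.generate_content([
    "Summarize the key points of this earnings call.",
    audio_file,
])
print(response.text)
```

The interesting part is that the audio file goes into the prompt like any other content, so there's no separate speech-to-text stage to set up.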
All in all, watching AI grow is seriously exciting. If you're one of the people who fear it will rule the world one day, don't be too scared. The way I see it, it's here to make our lives easier and give us more room to fulfill our potential as intelligent, intuitive, and creative human beings. It will simply spare us the boring stuff (like sitting through a long earnings call, you know).