AI Insider No. 33

Greetings, AI Insider readers. I keep saying this, but once again, there was a lot of news in the AI space this week. Google killed it again, this time adding new functionality to its Bard chatbot. And, as mentioned last week, Apple’s ground-breaking Vision Pro headset dropped, packed with AI-driven features. Honestly, geek that I am, I can’t justify spending $3,500, plus the cost of accessories, on one of these bad boys, but I’m having a blast watching the tech reviewers try them out. Onward.


Good News, Bad News:
Bard Now Generates Images

By Michelle Johnson, AI Insider

Google’s Bard now generates images thanks to a recent upgrade. Google’s image generator, Imagen 2, launched in Bard on Thursday, allowing users to enter a text prompt and get back images from the chatbot.

That’s a feature that Microsoft’s Bing (aka Copilot) and ChatGPT have had for a while now, so you could call this Google catching up, but Imagen’s been around since 2022. Why they’ve been sitting on it while their competitors took off is a mystery, but I digress.

The main advantage to having this feature included in chatbots is not having to click away to a separate image generator while you’re busy chatting with Bard. Or Bing. Or ChatGPT. For instance, I often ask ChatGPT to write something and then generate an image to go along with it.

How does Google’s new image maker stack up against the others? Answer: Pretty good, but it’s got a problem. Bard failed miserably at doing what it was asked, apparently due to guardrails set up to prevent it from generating problematic images.

Designed with safety in mind, Imagen 2 will not generate images of people by name (celebrities, for instance), and thanks to Google’s SynthID feature, all images are invisibly watermarked so that they can be identified as AI-generated. However, this beta needs some tweaking because it appears to be blocking fairly innocuous prompts.

This prompt is admittedly long, but there’s no clear reason why Bard would balk at it: “Please generate a close-up photograph of a middle-aged African American woman showcasing her expressive eyes and warm smile. The focus is sharp on her face, highlighting the texture of her skin and the details of her facial features. She has short, naturally curly hair that frames her face beautifully. The background is softly blurred, a neutral color that complements her skin tone without detracting from her. The lighting is soft and even, casting gentle shadows that accentuate her features, giving the image a serene and inviting feel. Taken on a high-resolution digital camera with a shallow depth of field to create a bokeh effect.”

Here’s Bard’s response:

After simplifying the prompt to “closeup portrait of a woman,” Bard came back with these:

On the first try with the longer prompt, ChatGPT’s Dall-E 3 served up the image on the left. Three tries and multiple prompt tweaks later, Bard finally gave me the one on the right. Three tries. I guess Imagen 2 needs some tweaks itself.

(Dall-E 3, left, Bard’s Imagen 2, right.)

I don’t think that either bested the reigning champ, Midjourney, which responded to the prompt this way:

(Midjourney)

And Midjourney doesn’t push back when you ask for someone by name:

(Midjourney)

That’s either a blessing or a curse, considering what happened with the Taylor Swift deepfakes last week.

At the same time that Google launched Imagen 2 for Bard, they also gave us a sneak peek at something in their Test Kitchen dubbed ImageFX. IMHO, it’s more interesting than the “more of the same” that they released for Bard.

Why? See the image below and note how this prompt includes drop-downs to help you tweak it. Who else has that? Nobody that I’ve seen. (Adobe Firefly comes closest because it will pop up alternate versions of a prompt for you.)

(ImageFX)

So, this raises the question of why Google would play it safe instead of just dropping ImageFX into Bard on Thursday. I guess it wasn’t ready in time for this update. But, frankly, neither was what they released, based on my test.

In any case, you can try Bard with image generation for free at bard.google.com.

2024 may be the year of AI video, but it appears that things are still popping for still images, too. Watch this space for updates.


(Apple)

Pricey New Apple Vision Pro
Packed with AI-driven Features

By Bard, for AI Insider

Apple Vision Pro officially went on sale on Friday, so let’s delve into some of the AI magic powering this ground-breaking device. 

From meticulously mapping your surroundings to understanding your eye contact in the real world, AI plays a critical role in Apple Vision Pro, bridging the physical and digital realms. 

Here’s a roundup of its AI-powered features:

Spatial Mapping and Tracking: The Vision Pro uses AI to create a real-time understanding of your surroundings. This includes mapping your environment, tracking your head and hand movements, and even your eye gaze. This allows for precise placement of virtual objects and interactions that feel natural and responsive.

3D Reconstruction and Image Processing: The Vision Pro’s 3D camera captures depth information, and AI algorithms process this data to reconstruct the real world in 3D. This enables features like spatial photos and videos, where you can move around within the captured scene and even view existing photos and videos in a fully immersive way.

Gesture Recognition and Control: Instead of traditional hand-held game controllers, the Vision Pro uses hand and finger gestures for interaction. AI algorithms identify and interpret these gestures, allowing users to navigate the interface, select objects, and manipulate virtual content.

EyeSight Feature: This unique feature uses AI to understand when you’re making eye contact with someone in the real world while wearing the headset. The goal is to maintain a sense of social presence and connection.

Personalized Spatial Audio: The Vision Pro creates a customized and immersive soundscape based on an individual’s head and ear shape. AI algorithms analyze this data to adjust the sound direction and create a realistic sense of spatial audio.

In summary, AI is deeply embedded in the Apple Vision Pro, powering its core features and aiming to create a seamless and immersive mixed-reality experience.


GPTs for Black History Month

In the Shameless Plug Department: I’ll start this off with GPTs created by me. Look out for others later in the month.

BlkChat: Chat with a famous black historical figure. Seriously, it’s fun. Ask W.E.B. Du Bois where he was born. Find out if Sojourner Truth was married. Who’s Alice Coachman? Find out by asking her in BlkChat.

Black Journeys: Planning a trip? Black Journeys will map out an itinerary for visits to sites of importance to African-American history. Just tell it where you’re going and what you’re interested in.

— Michelle Johnson


Off the Beaten ChatGPT Track

Welcome to a new feature: “Off the Beaten ChatGPT Track,” where we introduce you to something useful other than ChatGPT. 

By Michelle Johnson and You.com, for AI Insider

Face it, we’ve all got ChatGPT on the brain. Even though there are other chatbots, AI assistants, and apps out there, few of us spend the time looking at the alternatives.

That’s a shame because some of the other tools are easier to use, and they even run on ChatGPT! You may find that these “ChatGPT wrappers,” with their nicer interfaces and extra features, are more interesting than the original.

This week, we’ll take a look at You.com, billed as an AI Search Assistant.

What better way to learn what it does than to ask it about itself with the prompt: “Explain to me how you work. What features do you have?”

You.com: “YouChat is a language model-based assistant designed to provide assistance, information, and insights on a wide range of topics. As a user, you can interact with me by asking questions, seeking explanations, or requesting assistance with various tasks.”

Here’s how the company explains the different modes that You.com offers:

GPT-4 (Free, limited; YouPro at $9.99/mo., unlimited): Run this mode for free with limited access, or get unlimited access with a YouPro subscription. Students and teachers get a 30% discount on a YouPro for Education account.

Smart Mode (Free, unlimited): The free default mode on You.com offers quick, reliable responses with live web access, citations, and sources. Notably, You.com was the first consumer-facing LLM in late 2022 to provide real-time access to the internet, allowing for up-to-date answers with sources and citations.

Prompts to try:

  • Write a thank-you note after a job interview.
  • Best pizza in Brooklyn

Genius Mode (Free, limited; Paid, unlimited): Offers multistep computational abilities combined with data visualization through charts and plots and file uploads (PDFs, text, images, etc.). Genius Mode makes these advanced capabilities accessible and enjoyable for users of all levels when it comes to AI.

Prompts to try:

  • If you give a baby $5,000 at birth to invest in a no-fee stock index fund and assume a 10% average annual return, how much would they have by age 65? Calculate the same investment for someone starting at ages 10, 20, 30, 40, 50, and 60.
  • Upload a PDF and request, ‘Give me five key insights.’
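
If you’d like to sanity-check whatever Genius Mode returns for the investing prompt, the math is ordinary compound interest: future value = principal × (1 + rate)^years. Here’s a minimal Python sketch, assuming the return compounds once a year, the $5,000 is a one-time deposit, and there are no taxes or fees. The names future_value and balances_at_65 are mine, made up for illustration; this is not how You.com computes its answer.

```python
def future_value(principal: float, annual_rate: float, years: int) -> float:
    """Compound a one-time deposit once per year for the given number of years."""
    return principal * (1 + annual_rate) ** years


def balances_at_65(principal: float = 5_000, annual_rate: float = 0.10) -> dict[int, float]:
    """Map each starting age from the prompt to the balance at age 65."""
    # Birth (age 0), then the later starting ages listed in the prompt.
    starting_ages = [0, 10, 20, 30, 40, 50, 60]
    return {age: future_value(principal, annual_rate, 65 - age) for age in starting_ages}


if __name__ == "__main__":
    for age, balance in balances_at_65().items():
        print(f"Start at age {age:>2}: ${balance:,.0f} at 65")
```

Under those assumptions, the baby’s $5,000 grows to roughly $2.45 million by age 65, while the same deposit made at age 60 ends up at only about $8,050. If a chatbot’s numbers land far from that, check whether it assumed monthly compounding or recurring deposits.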

Research Mode (Free, limited; Paid, unlimited): Provides comprehensive yet digestible reports with extensive source citations and references, including for real-time news events. Additionally, its ability to create comparative tables makes it an ideal tool for tasks like decision-making.

Prompts to try:

  • Explain the background, action, and consequences of the Peloponnesian War.
  • Create a table for top noise-cancelling headphones that are not expensive.

Create Mode (Free, limited; Paid, unlimited): Transforms any concept into an AI image in an unlimited array of styles or a chart or plot. Envision it as a blank canvas solely dedicated to AI artistry. It eliminates the need for prompt engineering command inputs like “draw” or “create,” making the process more seamless and intuitive.

Prompts to try:

  • A laptop with 3-D paint splatters all over it.
  • A surreal image of papers peacefully floating up into a golden sky with puffy clouds.

Random Shorts

Bard Update Expanded: Yeah, Bard again. Bard, with the upgraded Gemini Pro language model, is now available in more than 40 languages and 230 countries and territories.

ChatGPT Adds GPT Feature: If you have a ChatGPT Plus account, you’re probably familiar with the extended features you can use via GPTs. This week OpenAI made it possible to incorporate GPTs into your prompt by typing @[GPT name here].

FCC Hanging Up on AI Robocalls: The FCC wants to criminalize those faked AI-generated robocalls, like the one that cloned Biden’s voice during the New Hampshire primary.

Arc, the web browser you’ve probably never heard of, now uses AI to browse the web for you. How does that work? You chat with it, and it will find info for you. It doesn’t just return a list of links. It generates a little web page with commentary and links. Yes, really. And there’s a mobile version, too. “Browse for Me” could be what’s next in search.

Attention AI Shoppers: Amazon is beta-testing an AI-powered shopping assistant called Rufus. It’s been trained on Amazon’s catalog of products, user reviews, and general web info.


Aht Gallery

This week’s theme: The Grapes of Math. See how image algorithms interpret the prompt: “Grapes sitting in a bowl on a wooden table, next to a vase of white flowers.” If you’d like to try this for yourself, here are the bots used: DreamStudio, Adobe Firefly, Dall-E 3.
