I recently developed a frontend application to upload images into an interactive platform.
I wanted to share a small part of the application with you to show you how AI can really help in operations where content analysis is needed.
Specifically, the content here is images, and Gemini performs the content detection and categorization.
The application is implemented with React, but the piece of code I'm about to share can be used in any framework.
index.ts
In this example I purposely kept distracting things to a minimum, such as CSS and other secondary methods.
This component renders a form and then, through the handleSubmit method, reads the value of the file input.
The toBase64 helper then converts that file to base64 format.
This data is used to make an HTTP request to the Cloud Function which will receive the image we have chosen.
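The frontend steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the component's actual code: only the `toBase64` name comes from the text, while `bytesToBase64`, `sendImage`, `ImagePayload`, the `endpoint` parameter, and the JSON body shape are hypothetical. The original helper may well rely on `FileReader`; this version uses `Blob.arrayBuffer()` and `btoa` instead.

```typescript
// Hypothetical payload shape: the media type plus the base64-encoded bytes.
interface ImagePayload {
  mimeType: string;
  data: string;
}

// Encode raw bytes as base64 by building a binary string first.
// Fine for a sketch; very large files would be better handled in chunks.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Read the selected file (a File is also a Blob) and return its base64 content.
async function toBase64(file: Blob): Promise<string> {
  return bytesToBase64(new Uint8Array(await file.arrayBuffer()));
}

// POST the encoded image to the Cloud Function.
// `endpoint` is an assumed URL supplied by the caller.
async function sendImage(file: Blob, endpoint: string): Promise<string> {
  const payload: ImagePayload = {
    mimeType: file.type,
    data: await toBase64(file),
  };
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.text();
}
```

Inside a submit handler, the first entry of the file input's `files` list could be passed to `sendImage` together with the URL of your own function.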
As mentioned, the backend is implemented with a Cloud Function. It is very simple, and you saw how it works in the previous lessons.
Let's get straight to the point. Here is the code that detects what is in the image and returns some extra information about it.
It uses the gemini-1.5-flash model with this prompt:
"Can you detect this food and tell me who invented it please?"
index.ts
Let's explain this code. All the magic happens inside model.generateContent, which accepts the object sent from the frontend.
mimeType: the media type of your image, e.g. "image/png" or "image/jpeg"
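As a rough sketch of the backend call, assuming the `@google/generative-ai` SDK, the prompt and the image travel together in a single `generateContent` call, with the image wrapped in an `inlineData` part. The names `buildParts`, `detectFood`, `InlineImage`, and `ModelLike` are hypothetical. `ModelLike` is a stand-in interface so the snippet stays self-contained; a model obtained from `getGenerativeModel({ model: "gemini-1.5-flash" })` is expected to fit this shape.

```typescript
const PROMPT = "Can you detect this food and tell me who invented it please?";

// The image as sent by the frontend: media type plus base64 data.
interface InlineImage {
  mimeType: string;
  data: string;
}

// A request part is either plain text or an inline image.
type Part = string | { inlineData: InlineImage };

// Assumed minimal shape of the SDK model object used in this sketch.
interface ModelLike {
  generateContent(parts: Part[]): Promise<{ response: { text(): string } }>;
}

// Build the request: the prompt first, then the image as inline data.
function buildParts(image: InlineImage): Part[] {
  if (!/^image\//.test(image.mimeType)) {
    throw new Error(`Unsupported media type: ${image.mimeType}`);
  }
  return [PROMPT, { inlineData: { mimeType: image.mimeType, data: image.data } }];
}

// Ask the model to describe the food and return its text answer.
async function detectFood(model: ModelLike, image: InlineImage): Promise<string> {
  const result = await model.generateContent(buildParts(image));
  return result.response.text();
}
```

Keeping `buildParts` separate from the network call makes it possible to validate the media type and test the request shape without calling the API.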
It was so simple to implement the solution that I enjoyed doing some tests.
Here are the results:
✅ Right
This is a pizza. The inventor of the pizza is unknown, but it is believed to have originated in Naples, Italy.
The first documented use of the word "pizza" was in 10th century Naples.
However, pizzas as we know them today are a more recent invention, with the first pizza parlor opening in Naples in the 18th century.
✅ Correct
This is a stack of pancakes.
Pancakes are believed to have originated in ancient times, with various cultures having their own versions of flatbreads cooked on griddles.
The modern pancake as we know it is thought to have evolved in Europe, particularly in the medieval period.
It is a simple and versatile food that can be enjoyed for breakfast, brunch, or even dessert.
✅ Exact
This is a taco. There is significant debate about the origins of the taco in Mexico, with some arguing that the taco predates the arrival of the Spanish in Mexico.
There is anthropological evidence that indigenous people living in the Valley of Mexico lake region traditionally ate tacos filled with small fish.
In this article, you explored food detection with Gemini.
By walking through a practical example of identifying food items in an image, you learned how Gemini's advanced AI can analyze visual data and provide accurate classifications.
This technology has the potential to revolutionize various industries, from optimizing inventory management in restaurants to automating meal planning and dietary tracking for individuals.
While this is just a glimpse into the possibilities of Gemini's food detection abilities, it showcases its immense potential to simplify and enhance our daily lives.
As AI technology continues to evolve, you can expect even more innovative and transformative applications of Gemini's image analysis capabilities across diverse sectors.
With a little exploration and creativity, you too can unlock the power of AI to solve real-world problems and shape a smarter, more efficient future.