Google is making strides with its Gemini AI model, recently shipping several updates and introducing new features aimed at enhancing user interaction and productivity. Earlier this month, the company rolled out Deep Research, powered by Gemini 1.5 Pro, adding a considerable capability to its suite of tools. The feature lets users request research on a specific topic; Gemini then scans dozens of websites and compiles a comprehensive report based on the information it gathers.
According to Google’s blog posts for Vietnam and Thailand, these features are part of a broader rollout of the Gemini model to additional regions, giving more users access to the latest capabilities. Specifically, users are being upgraded from the Gemini 1.5 Pro and Flash models to the Gemini 2.0 Flash Experimental and 2.0 Experimental Advanced models. Google positions the Flash version for “everyday help” and the Advanced version for “complex tasks.”
Gemini 1.5 Pro is currently the only model that supports the Deep Research feature. Users submit a prompt, review the research plan Gemini proposes, and can modify it before the search begins. After examining dozens, sometimes hundreds, of sites, Gemini produces a thorough report, saving users significant time on research tasks. Notably, once the task begins, users can step away from Gemini entirely, and the compiled report can later be exported to Google Docs.
Google has made Deep Research available to Gemini Advanced subscribers in the US and other selected regions. The feature is not yet supported in the iOS or Android apps; it is accessible through the Gemini web app, including from a mobile browser.
Another update gives Gemini the ability to recognize when a PDF is open on the user’s screen. This feature, reported by The Verge, lets Gemini answer questions about the file’s contents directly, streamlining how users interact with their documents.
When viewing a PDF in the Files by Google app, users can summon Gemini and tap the newly added “Ask about this PDF” button, then pose questions like “What’s the overview of this document?” or “Can you explain this section?” The AI responds with detailed answers, effectively acting as a personal assistant for interpreting the document.
This capability was first previewed at Google’s I/O developer conference in May 2024, and it marks another meaningful expansion of Gemini’s utility, which now extends beyond answering general questions about web pages and YouTube videos to direct file analysis.
For documents or files that Gemini does not directly support, the assistant remains useful through screen-based questions: users can tap “Ask about this screen” to get an analysis of whatever is visible, whether they are browsing an article or watching a video.
This integration positions Gemini as more than an assistant: it becomes a capable tool for navigating digital content across devices. With the PDF recognition feature, Google aims to make document review more efficient and interactive.
To access these advanced features, users must subscribe to Gemini Advanced, Google’s premium AI assistant tier. Although the rollout is still underway, these developments signal the growing role of AI-driven tools within popular apps like Files by Google, enhancing productivity and content management.
With Gemini’s latest functionalities, users can expect their workflows to become quicker, more efficient, and increasingly interactive, whether they are dealing with reports, reading PDFs for work, or tackling complex information retrieval tasks.