MarkItDown is an open-source Python utility that simplifies converting diverse file formats into Markdown, designed to prepare data for LLMs and RAG systems. It handles various file types, preserves document structure, and integrates with LLMs for tasks like image description.
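As a rough illustration of the conversion step described above, here is a minimal sketch. It assumes the `markitdown` package is installed (`pip install markitdown`) and uses the `MarkItDown().convert(...)` call and `text_content` attribute shown in the project's README. The helper name `to_markdown` is hypothetical, and optional features such as LLM-based image description are left out.

```python
def to_markdown(path):
    """Convert one file to Markdown text using MarkItDown.

    Assumption: the third-party `markitdown` package is installed.
    It is imported lazily so this module still loads without it.
    """
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(str(path))
    # The conversion result exposes the Markdown as `text_content`.
    return result.text_content


if __name__ == "__main__":
    import sys

    # Usage: python to_markdown.py report.docx
    if len(sys.argv) > 1:
        print(to_markdown(sys.argv[1]))
```

The returned string can then be chunked and indexed like any other Markdown input to a RAG pipeline.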
Microsoft researchers introduce LongRoPE2, a method to extend large language model context windows to 128K tokens while maintaining over 97% short-context accuracy, addressing key limitations in positional embeddings.
Bing Web Search API returns safe, ad-free, location-aware search results drawn from billions of web documents. A single API call searches webpages, images, videos, and news, helping your users find what they are looking for on the web.
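To make the "single API call" concrete, the sketch below builds (but does not send) a search request using only the standard library. It assumes the v7 REST endpoint `https://api.bing.microsoft.com/v7.0/search`, the `Ocp-Apim-Subscription-Key` header, and the `q`, `mkt`, `safeSearch`, and `count` query parameters. The function name and defaults are illustrative, and the current status of the service should be checked before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed v7 endpoint; verify against current Microsoft documentation.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_search_request(query, api_key, market="en-US",
                         safe_search="Moderate", count=10):
    """Build a GET request for a web search without sending it."""
    params = urlencode({
        "q": query,                 # the user's search terms
        "mkt": market,              # market code, drives location-aware results
        "safeSearch": safe_search,  # Off, Moderate, or Strict
        "count": count,             # number of results requested
    })
    return Request(
        f"{BING_ENDPOINT}?{params}",
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )


if __name__ == "__main__":
    req = build_search_request("open source markdown converter", "YOUR_KEY")
    print(req.full_url)
```

Passing the request to `urllib.request.urlopen` and decoding the JSON body would, under the same assumptions, expose web results under `webPages.value`, each with `name`, `url`, and `snippet` fields.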
A Microsoft engineer demonstrates how WebAssembly modules can run alongside containers in Kubernetes environments, offering benefits like reduced size and faster cold start times for certain workloads.
Microsoft has open-sourced MarkItDown, a state-of-the-art application designed to convert various file types into Markdown format for seamless integration, collaboration, and accessibility. The tool supports multiple file formats, including PDFs, PowerPoint presentations, Word documents, Excel spreadsheets, images, audio, HTML, text-based formats, and ZIP files, making it a versatile utility for users across different domains.
Microsoft has released the OmniParser model on HuggingFace, a vision-based tool designed to parse UI screenshots into structured elements, enhancing intelligent GUI automation across platforms without relying on additional contextual data.
OpenRecall is open-source software that aims to be a privacy-focused alternative to Microsoft's Recall feature. It captures the user's digital history, processes text and images using OCR, and lets users find specific information by searching for relevant keywords. It currently stores data locally but does not encrypt it. It is available for Windows, macOS, and Linux.
This article provides a step-by-step guide on fine-tuning the Florence-2 model for object detection tasks, including loading the pre-trained model, fine-tuning with a custom dataset, and evaluating the model's performance.
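One part of preparing a custom dataset for this kind of fine-tuning is turning bounding boxes into text targets. The sketch below is not taken from the article. It assumes the commonly described Florence-2 convention of an `<OD>` task prompt and a target string of the form `label<loc_x1><loc_y1><loc_x2><loc_y2>`, with coordinates quantized into 1000 bins relative to the image size. The function names are hypothetical.

```python
NUM_BINS = 1000  # assumed number of location bins per axis


def quantize(value, size):
    """Map a pixel coordinate to a bin index in [0, NUM_BINS - 1]."""
    bin_index = int(value / size * NUM_BINS)
    return min(max(bin_index, 0), NUM_BINS - 1)


def box_to_target(label, box, image_width, image_height):
    """Format one (x1, y1, x2, y2) pixel box as a detection target string."""
    x1, y1, x2, y2 = box
    coords = (
        quantize(x1, image_width),
        quantize(y1, image_height),
        quantize(x2, image_width),
        quantize(y2, image_height),
    )
    return label + "".join(f"<loc_{c}>" for c in coords)


def make_example(annotations, image_width, image_height):
    """Build a (prompt, target) pair from a list of (label, box) annotations."""
    target = "".join(
        box_to_target(label, box, image_width, image_height)
        for label, box in annotations
    )
    return "<OD>", target


if __name__ == "__main__":
    prompt, target = make_example([("car", (320, 240, 640, 480))], 640, 480)
    print(prompt, target)
```

The same quantization has to be inverted at evaluation time to compare predicted boxes against ground truth in pixel coordinates.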
A recent TechRadar poll found that Grammarly has emerged as a surprise hit among AI tools, with 584 monthly users. ChatGPT remains the most popular tool, while Microsoft Copilot and Google Gemini also showed strong results.
Microsoft has deployed GPT-4, a large language model, in an isolated, air-gapped Azure Government Top Secret cloud for use by the Department of Defense. Once accredited, Pentagon officials will be able to use the technology in a secure environment. The tool is expected to help DOD officials deal with vast amounts of data and simplify information sorting. Microsoft is a major investor in OpenAI, the maker of GPT-4 and the popular ChatGPT.