Apple may be dead last in the AI race, at least when you consider competition from companies like OpenAI, Google, and Meta, but that doesn’t mean the company isn’t working on the tech. In fact, it seems much of the work Apple does on AI is behind the scenes: While Apple Intelligence is, well, there, the company’s researchers are working on other ways to improve AI models for everyone, not just Apple users. The latest project? Improving AI image editors based on text prompts.
In a paper published last week, researchers introduced Pico-Banana-400K, a dataset of 400,000 “text-guided” images selected to improve AI-based image editing. Apple believes its image dataset improves upon existing sets by including higher quality images with more diversity: The researchers found that existing datasets either use images produced by AI models, or are not varied enough, which can hinder efforts to improve the models.
Funnily enough, Pico-Banana-400K is designed to work with Nano Banana, Google’s image editing model. Researchers say they used Nano Banana to generate 35 different types of edits for the dataset, and tapped Gemini-2.5-Pro to assess the quality of those edits and decide whether they should remain part of the overall dataset.
Within those roughly 400,000 images, there are 258,000 samples of single edits (where Apple compares the original image to one with edits); 56,000 “preference pairs,” which distinguish between failed and successful edit generations; and 72,000 “multi-turn sequences,” which walk through two to five edits.
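For readers who want a feel for how a generate-then-judge pipeline like this could sort its output, here is a minimal Python sketch. It is not Apple’s code: the `EditAttempt` class, the `build_dataset` function, the 0.7 score threshold, and the file names are all hypothetical, and the judge’s score is assumed to arrive as a number between 0 and 1. Passing edits are kept as single-edit samples, and a failed attempt at the same instruction is paired with the successful one to form a preference pair.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EditAttempt:
    """One generated edit plus the judge's score (hypothetical structure)."""
    source_image: str
    instruction: str
    edited_image: str
    judge_score: float  # assumed to be in [0, 1]


def build_dataset(attempts, threshold=0.7):
    """Sort judged attempts into single edits and preference pairs.

    The threshold is an illustrative assumption, not a value from the article.
    """
    # Group every attempt by the (image, instruction) it was trying to satisfy.
    grouped = defaultdict(list)
    for attempt in attempts:
        grouped[(attempt.source_image, attempt.instruction)].append(attempt)

    single_edits = []
    preference_pairs = []
    for group in grouped.values():
        passed = [a for a in group if a.judge_score >= threshold]
        failed = [a for a in group if a.judge_score < threshold]
        if not passed:
            continue  # no successful edit, so nothing to keep for this prompt
        best = max(passed, key=lambda a: a.judge_score)
        single_edits.append(best)
        # Each failed attempt becomes the "rejected" half of a preference pair.
        for bad in failed:
            preference_pairs.append((best, bad))
    return single_edits, preference_pairs


attempts = [
    EditAttempt("cat.jpg", "add film grain", "cat_v1.jpg", 0.4),
    EditAttempt("cat.jpg", "add film grain", "cat_v2.jpg", 0.9),
    EditAttempt("dog.jpg", "zoom in", "dog_v1.jpg", 0.8),
    EditAttempt("sign.jpg", "change font color", "sign_v1.jpg", 0.2),
]
single_edits, preference_pairs = build_dataset(attempts)
```

In this toy run, the cat and dog edits that pass the judge are kept, the failed cat attempt becomes the rejected half of a preference pair, and the sign edit is dropped because no attempt succeeded.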
Researchers note that different functions had different success rates in this dataset. Global edits and stylization are “easy,” achieving the highest success rates; object semantics and scene context are “moderate;” while precise geometry, layout, and typography are “hard.” The best performing function, “strong artistic style transfer,” which can include changing an image’s style to “Van Gogh” or anime, has a 93% success rate. The lowest performing function, “change font style or color of visible text if there is text,” only succeeded 58% of the time. Other tested functions include “add new text” (67% success rate), “zoom in” (74% success rate), and “add film grain or vintage filter” (91% success rate).
Unlike many of Apple’s products, which are usually closed to the company’s own platforms, Pico-Banana-400K is open for all researchers and AI developers to use. It’s cool to see Apple researchers contributing to open research like this, especially in an area where Apple is generally behind. Will we actually get an AI-powered Siri anytime soon? Unclear. But it’s clear Apple is actively working on AI, perhaps just in its own way.