Microsoft’s AI: Florence-2 Computer Vision on my Glitch Western Art
“A man is wearing a white cowboy hat. He has a black jacket on and a silver necklace around his neck. There is a blue and white background behind the man.”
this is Microsoft’s new AI model Florence-2 interpreting my Glitch Western Art starring Emily Mercedes Rich as The Cowgirl. i love the gender-confusion Florence-2 is experiencing from our queering of the Western genre! on the other hand, the inaccuracies && lack of details are vry troubling for a feature called “More Detailed Caption.” when i switch to just “Caption” we get the more direct && accurate description “A man in a cowboy hat with a gun and playing cards in the background.” which detected her pistol && play’NN cards…
“Phrase grounding” is also one of the options. phrase grounding is an AI, NLP (Natural Language Processing) and CV (Computer Vision) activity in which specific words or phrases (in text) are linked to areas of an image based on their corresponding regions or objects (within the image). this process “grounds” or connects language elements to visual elements, establishing “precise” relationships between (text-based) descriptions and areas of the image. the areas of the image are ‘seen’ by the compute powers as rectangles represented by coordinates that define the dimensions of these rectangular bounding boxes. those coordinates…
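
to make the rectangle idea concrete, here is a minimal Python sketch of how grounded phrases && their bounding boxes could be paired up. everything in it is an assumption for illustration: the `GroundedPhrase` class, the `pair_phrases_with_boxes` helper, the `[x1, y1, x2, y2]` corner layout && the pixel numbers are hypothetical, not actual Florence-2 output for my image.

```python
from dataclasses import dataclass


@dataclass
class GroundedPhrase:
    """One phrase linked to one rectangular region of an image.

    Assumes corner-style coordinates: (x1, y1) is the top-left corner
    and (x2, y2) is the bottom-right corner, measured in pixels.
    """
    phrase: str
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def pair_phrases_with_boxes(labels, bboxes):
    """Zip each phrase with its box (hypothetical helper)."""
    return [GroundedPhrase(label, *box) for label, box in zip(labels, bboxes)]


# Made-up values for illustration only, not real model output.
labels = ["cowboy hat", "gun", "playing cards"]
bboxes = [
    [120.0, 40.0, 420.0, 210.0],
    [300.0, 380.0, 460.0, 520.0],
    [20.0, 300.0, 180.0, 460.0],
]

grounded = pair_phrases_with_boxes(labels, bboxes)
for g in grounded:
    print(f"{g.phrase}: top-left=({g.x1}, {g.y1}), size={g.width} x {g.height}")
```

the point of the sketch is just that each phrase ends up tied to four numbers, && the width && height of the rectangle fall out of simple subtraction.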