I’m trying to build an app and I need general photo analysis. I’ve managed to connect to the Google Cloud Vision API, but it gets confused pretty easily. The one used by Bing and GPT is much better (I wonder if they use the Microsoft Azure model?). Does anyone have experience analysing photographs? I’m trying to get scene descriptions so I can batch-send them to GPT for somewhat accurate descriptions.
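To make the batching idea concrete, here is a rough sketch of the step between the Vision API and GPT. It assumes each photo's result is already parsed into a dict shaped like the Vision REST response (a `labelAnnotations` list with `description` and `score` fields). The function names, the 0.7 score cutoff, the batch size, and the prompt wording are all made up for illustration, and no API calls are made here.

```python
from typing import Dict, List


def summarize_labels(annotation: Dict, min_score: float = 0.7) -> str:
    """Keep only labels at or above min_score (an arbitrary, assumed cutoff)."""
    labels = annotation.get("labelAnnotations", [])
    kept = [
        label["description"]
        for label in labels
        if label.get("score", 0.0) >= min_score
    ]
    return ", ".join(kept) if kept else "(no confident labels)"


def build_batches(
    photos: Dict[str, Dict],
    batch_size: int = 10,
    min_score: float = 0.7,
) -> List[str]:
    """Group photos into prompt strings, batch_size photos per prompt."""
    items = list(photos.items())
    prompts = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        lines = [
            f"{name}: {summarize_labels(annotation, min_score)}"
            for name, annotation in chunk
        ]
        # Hypothetical prompt wording; adjust to whatever the model responds to best.
        header = "Describe the scene in each photo from its detected labels:"
        prompts.append(header + "\n" + "\n".join(lines))
    return prompts
```

Dropping low-score labels before building the prompt is one way to keep the noisier guesses from leaking into the final descriptions. Each returned string would then be sent as a single request.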