* When using CuPL prompts instead of the standard prompts from OpenAI, the zero-shot accuracy is 80.3%. When evaluating at 280x280 and changing resize to squash, Ross Wightman found the model achieves 80.4%. ** In addition to natural language supervision, CoCa uses synthetic captions constructed with the labels from the JFT-3B dataset. In addition to natural language supervision, CoCa uses synthet