Apple’s MM1: A multimodal LLM capable of interpreting both image and text data

A team of computer scientists and engineers at Apple has developed an LLM that the company claims can interpret both images and text. The group has posted a paper to the arXiv preprint server describing their new MM1 family of multimodal models and its test results.