LLMs use a tokenizer stage to convert input data into neural-network inputs, and a de-tokenizer at the output.
Those tokens are not limited to “human language”: they can just as well encode positions, orientations, directions, movements, and so on. “Body language”, or the flight pattern of a bee, is as tokenizable as any other input data.
Whatever concepts a dolphin language turns out to contain, no matter what they are, could then be described in a human language, and/or matched to the human words that express the same concepts.
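As a minimal sketch of the idea that movement is tokenizable, the hypothetical helper below quantizes 2D movement vectors into a small discrete vocabulary of direction tokens, turning a path (say, a bee's flight trace) into a token sequence an LLM-style model could consume. The binning scheme and function names are illustrative assumptions, not any established encoding.

```python
import math

def tokenize_heading(dx: float, dy: float, n_bins: int = 8) -> int:
    """Quantize a 2D movement vector into one of n_bins direction tokens.

    Token 0 is movement along +x; bins proceed counter-clockwise.
    (Illustrative scheme only; real systems use learned codebooks, e.g. VQ.)
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)   # heading in [0, 2*pi)
    width = 2 * math.pi / n_bins                 # angular width of each bin
    # Shift by half a bin so each token is centered on its cardinal direction.
    return int(((angle + width / 2) % (2 * math.pi)) // width)

def tokenize_path(points):
    """Turn a sequence of (x, y) positions into a sequence of direction tokens."""
    return [
        tokenize_heading(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
```

For example, a path that moves right and then up, `[(0, 0), (1, 0), (1, 1)]`, becomes the token sequence `[0, 2]` with eight direction bins. A de-tokenizer would simply map tokens back to representative movement vectors.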