• jarfil@beehaw.org
    3 days ago

    LLMs use a tokenizer stage to convert input data into neural-network inputs, then a de-tokenizer stage at the output.

    Those tokens are not limited to “human language”; they can just as well be positions, orientations, directions, movements, and so on. “Body language”, or the flight pattern of a bee, is as tokenizable as any other input data.
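    As a minimal sketch of what that could look like (all names here are hypothetical, not any real tokenizer's API), you could quantize 2D positions onto a grid so each grid cell becomes a discrete token ID, exactly the kind of integer a text tokenizer emits:

```python
# Hypothetical sketch: turning non-language data (2D positions in the unit
# square) into discrete token IDs via grid quantization, plus the inverse
# "de-tokenizer" that maps a token back to a representative position.

def tokenize_position(x: float, y: float, grid: int = 16) -> int:
    """Map a position in [0, 1) x [0, 1) to one of grid*grid token IDs."""
    col = min(int(x * grid), grid - 1)
    row = min(int(y * grid), grid - 1)
    return row * grid + col

def detokenize_position(token: int, grid: int = 16) -> tuple[float, float]:
    """Map a token ID back to the center of its grid cell."""
    row, col = divmod(token, grid)
    return ((col + 0.5) / grid, (row + 0.5) / grid)

# A short "flight path" becomes a token sequence, just like a sentence:
path = [(0.1, 0.1), (0.3, 0.2), (0.5, 0.5)]
tokens = [tokenize_position(x, y) for x, y in path]
```

    De-tokenizing is lossy here (you only get back the cell center), which is fine: text tokenizers also collapse many raw byte sequences into one vocabulary entry.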

    Whatever concepts a dolphin language may contain, they could then be described in a human language, and/or matched to the human words for the same ideas.