Sebastian Raschka @rasbt As an LLM finetuner, I recently started getting into model merging. I wrote up a short tutorial on linear merging to introduce the topic: https://lightning.ai/lightning-ai/studios/efficient-linear-model-merging-for-llms
Btw does anyone happen to have good examples of LLMs that work well when merged via linear merging? And for…
Replying to @rasbt
The reason I have been asking all AI groups to use global open formats, and to support saving, sharing, comparing, and merging of conversations, is to allow combining all conversations globally for any combination of humans and AIs. “Permanent learning” means “permanent memory”: good record keeping, courteously remembering what was said and openly discussed, and carefully verifying and studying implications and possibilities.
Unless the raw data and tokens are indexed and verified, simply combining parameters hides the real meaning. It is possible, but only with many more explicit global standards, so that everyone is on the same game board. Right now, when the players are all trying to make the rules to benefit themselves, it is not going to happen, at least not without hurting a lot of innocent bystanders, or people just trying to live quiet lives with dignity and purpose.
If you are ready to check all the AIs and share open methods, linear combinations might work in a few places. I suggest focusing more on indexing the source materials, sharing conversations, and standardizing tokens globally for all languages (including all STEMC-FGOT, Science Technology Engineering Mathematics Computing – Finance Government Organizations Topics). If you standardize openly and work hard, at least “apple”, “orange”, “important”, “not important”, “big”, “small”, “human”, and “memory” can be mapped properly. Fewer than a million “terms” in a global language cover many things. If you don’t work hard at it, then it is a pile of lies and mysteries, not global communication and reliable, trustworthy knowledge.
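For readers unfamiliar with the term, a linear combination of models here means a weighted average of matching parameters. Below is a minimal sketch, assuming both models expose identical parameter names and sizes (which in practice implies the same architecture and the same token vocabulary). The names `linear_merge`, `model_a`, `model_b`, and `alpha` are hypothetical and are not taken from the linked tutorial; plain Python lists stand in for real weight tensors.

```python
def linear_merge(model_a, model_b, alpha=0.5):
    """Return alpha * model_a + (1 - alpha) * model_b, parameter by parameter.

    Each model is a dict mapping a parameter name to a flat list of floats
    (an illustrative stand-in for a framework's weight tensors).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Averaging is only meaningful when both models name the same parameters.
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have identical parameter names")

    merged = {}
    for name, weights_a in model_a.items():
        weights_b = model_b[name]
        # A size mismatch usually means different vocabularies or layer widths.
        if len(weights_a) != len(weights_b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * a + (1.0 - alpha) * b
            for a, b in zip(weights_a, weights_b)
        ]
    return merged


# Toy example with made-up numbers, not real model weights.
model_a = {"embed.weight": [1.0, 2.0], "out.bias": [0.0]}
model_b = {"embed.weight": [3.0, 4.0], "out.bias": [1.0]}
merged = linear_merge(model_a, model_b, alpha=0.5)
```

The name and size checks matter more than the arithmetic. Averaging entry i of an embedding only makes sense if entry i refers to the same token in both models, which is exactly the kind of shared standard argued for above.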
You did not show a specific example, so I cannot easily guess which ones you looked at. Without open examples, just words, all I can say is good luck then.
All mathematics and computing are supposed to “bolt nicely together”, but they are usually “shared” without sufficient context, and are neither traceable nor verifiable. Each person and group says things they think others understand, but when I check the Internet, most groups do not show their dependencies, definitions, or assumptions. If your queries and answers are that vague, no amount of twiddling or merging parameters will help.
For that matter, these (open?) discussions ought to be standardized for comparison and merging. That would require knowing the background and experience of each member of the conversation, whether human, AI, or groups of either or both.
Richard Collins, The Internet Foundation