Large Language Models

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training (arxiv.org)

0 points 1 source 1 minute ago

Researchers propose the PC Layer, a layer that applies polynomial preconditioning to keep weights well conditioned throughout large language model (LLM) training, improving pre-training performance.

large-language-models machine-learning preconditioning

The LLM warnings Google fired Timnit Gebru over have all come true (tumblr.com)

0 points 1 source 1 minute ago

Google fired Timnit Gebru in 2020 for refusing to retract a research paper that warned about the dangers of large language models. The post argues that every warning in the paper has since come true, despite the industry's efforts to downplay them.

ai google large-language-models timnit-gebru

It's Not Just X. It's Y (mail.cyberneticforests.com)

0 points 1 source 1 minute ago

The construction 'It's not X, it's Y' has become common in text generated by large language models (LLMs), sparking a debate about its use in writing and automated language production.

automated-language-production large-language-models writing

LLMSurgeon: Diagnosing Data Mixture of Large Language Models (arxiv.org)

0 points 1 source 1 minute ago

Researchers propose LLMSurgeon, a method that diagnoses the data mixture of a large language model by analyzing its outputs and identifying inconsistencies.

data-mixture large-language-models models

Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders (arxiv.org)

0 points 1 source 1 minute ago

Researchers propose using sparse autoencoders to extract model internals from large language models (LLMs) and using them to guide post-training data engineering.

large-language-models machine-learning

Squeezing Capacity from Multimodal Large Language Models for Subject-driven Generation (arxiv.org)

0 points 1 source 5 hours ago

Researchers propose a method to get more capacity out of multimodal large language models for subject-driven generation, which is used in text-to-image synthesis applications.

computer-vision large-language-models pattern-recognition

Beyond Summaries: Structure-Aware Labeling of Code Changes with Large Language Models (arxiv.org)

0 points 1 source 5 hours ago

Researchers propose a structure-aware method for labeling code changes with large language models, aiming to improve the accuracy of code change analysis.

ai code-analysis large-language-models