Scholar Articles
Computer Science
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement ...
Personalized Wireless Federated Learning for Large Language Models
Large Language Models (LLMs) have revolutionized natural language processing tasks. However, their deployment in wireless networks still faces challenge...
Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction
Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with sema...
3D Vision-Language Gaussian Splatting
Recent advancements in 3D reconstruction methods and vision-language models have propelled the development of multi-modal 3D scene understanding, which...
Paint by Inpaint: Learning to Add Image Objects by Removing Them First
Image editing has advanced significantly with the introduction of text-conditioned diffusion models. Despite this progress, seamlessly adding objects to...
Raidar: geneRative AI Detection viA Rewriting
We find that large language models (LLMs) are more likely to modify human-written text than AI-generated text when tasked with rewriting. This tendency ...
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Large Language Models (LLMs) are rapidly surpassing human knowledge in many domains. While improving these models traditionally relies on costly human d...
latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
We present latentSplat, a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a light-weight generative 2D arc...
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models
Generative Large Language Models (LLMs) have made significant strides across various tasks, but they remain vulnerable to backdoor attacks, where speci...
Do language models plan ahead for future tokens?
Do transformers "think ahead" during inference at a given position? It is known transformers prepare information in the hidden states of the forward pa...
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation
Comprehensive clinical documentation is crucial for effective healthcare delivery, yet it poses a significant burden on healthcare professionals, leadin...
Self-Discover: Large Language Models Self-Compose Reasoning Structures
We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems ...
HyperFast: Instant Classification for Tabular Data
Training deep learning models and performing hyperparameter tuning can be computationally demanding and time-consuming. Meanwhile, traditional machine l...
AgentReview: Exploring Peer Review Dynamics with LLM Agents
Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analyses often rely on explora...
OpenDataLab: Empowering General Artificial Intelligence with Open Datasets
The advancement of artificial intelligence (AI) hinges on the quality and accessibility of data, yet the current fragmentation and variability of datas...
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments
Evaluating large language models (LLM) in clinical scenarios is crucial to assessing their potential clinical utility. Existing benchmarks rely heavily...
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability. Beyond holistic image understandi...
Reinforcement Learning for Collision-free Flight Exploiting Deep Collision Encoding
This work contributes a novel deep navigation policy that enables collision-free flight of aerial robots based on a modular approach exploiting deep col...
Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations
The robustness of recent Large Language Models (LLMs) has become increasingly crucial as their applicability expands across various domains and real-wo...
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
The math abilities of large language models can represent their abstract reasoning ability. In this paper, we introduce and open-source our math reasoni...
Understanding Robustness of Visual State Space Models for Image Classification
Visual State Space Model (VMamba) has recently emerged as a promising architecture, exhibiting remarkable performance in various computer vision tasks. ...
Ethical and social risks of harm from Language Models
This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innov...
Audio Anti-Spoofing Detection: A Survey
The availability of smart devices leads to an exponential increase in multimedia content. However, the rapid advancements in deep learning have given ri...
WPO: Enhancing RLHF with Weighted Preference Optimization
Reinforcement learning from human feedback (RLHF) is a promising solution to align large language models (LLMs) more closely with human values. Off-pol...
T3: Transparent Tracking Triggering for Fine-grained Overlap of Compute Collectives
Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devic...
Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models
While many have shown how Large Language Models (LLMs) can be applied to a diverse set of tasks, the critical issues of data contamination and memorizat...
Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models
Knowledge graph completion (KGC) is a widely used method to tackle incompleteness in knowledge graphs (KGs) by making predictions for missing links. Des...
Understanding deep learning requires rethinking generalization
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance....
ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
KV cache stores key and value states from previous tokens to avoid re-computation, yet it demands substantial storage space, especially for long sequenc...
CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs
Timely, personalized feedback is essential for students learning programming. LLM-powered tools like ChatGPT offer instant support, but reveal direct a...
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer
In this report, we present RT-DETRv2, an improved Real-Time DEtection TRansformer (RT-DETR). RT-DETRv2 builds upon the previous state-of-the-art real-ti...
Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?
Neural Radiance Field (NeRF) has achieved superior performance for novel view synthesis by modeling the scene with a Multi-Layer Perceptron (MLP) and a...
Gemini: A Family of Highly Capable Multimodal Models
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understandin...
Large Language Model with Graph Convolution for Recommendation
In recent years, efforts have been made to use text information for better user profiling and item characterization in recommendations. However, text in...
WildGaussians: 3D Gaussian Splatting in the Wild
While the field of 3D scene reconstruction is dominated by NeRFs due to their photorealistic quality, 3D Gaussian Splatting (3DGS) has recently emerged...
Language Models for Code Completion: A Practical Evaluation
Transformer-based language models for automatic code completion have shown great promise so far, yet the evaluation of these models rarely uses real da...
Transcriptomics-guided Slide Representation Learning in Computational Pathology
Self-supervised learning (SSL) has been successful in building patch embeddings of small histology images (e.g., 224x224 pixels), but scaling these mode...
AI and personalized learning: bridging the gap with modern educational goals
Personalized learning (PL) aspires to provide an alternative to the one-size-fits-all approach in education. Technology-based PL solutions have shown no...
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translati...
VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections
Graph transformer has been proven as an effective graph learning method for its adoption of attention mechanism that is capable of capturing expressive...
How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation
Conversational Recommender System (CRS) interacts with users through natural language to understand their preferences and provide personalized recommend...
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Prompt ensembling of Large Language Model (LLM) generated category-specific prompts has emerged as an effective method to enhance zero-shot recognition...
Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance...
Visibility into AI Agents
Increased delegation of commercial, scientific, governmental, and personal activities to AI agents – systems capable of pursuing complex goals with limi...
Compression Represents Intelligence Linearly
There is a belief that learning to compress well will lead to intelligence. Recently, language modeling has been shown to be equivalent to compression,...
Dual Operating Modes of In-Context Learning
In-context learning (ICL) exhibits dual operating modes: task learning, i.e., acquiring a new skill from in-context samples, and task retrieval, i.e., l...
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizably usually con...
OPEN TEACH: A Versatile Teleoperation System for Robotic Manipulation
Open-sourced, user-friendly tools form the bedrock of scientific advancement across disciplines. The widespread adoption of data-driven learning has le...
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Large language models (LLMs) achieve remarkable performance in natural language understanding but require substantial computation and memory resources. ...
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Direct preference optimization (DPO) has shown to be an effective method for large language model (LLM) alignment. Recent works have attempted to apply...
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences
Recent advancements in foundation models (FMs) have unlocked new prospects in autonomous driving, yet the experimental settings of these studies are pre...
Depth-aware Test-Time Training for Zero-shot Video Object Segmentation
Zero-shot Video Object Segmentation (ZSVOS) aims at segmenting the primary moving object without any human annotations. Mainstream solutions mainly foc...
Uncertainty Quantification on Clinical Trial Outcome Prediction
The importance of uncertainty quantification is increasingly recognized in the diverse field of machine learning. Accurately assessing model prediction...
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous st...
A Survey On Text-to-3D Contents Generation In The Wild
3D content creation plays a vital role in various applications, such as gaming, robotics simulation, and virtual reality. However, the process is labor-...
MileBench: Benchmarking MLLMs in Long Context
Despite the advancements and impressive performance of Multimodal Large Language Models (MLLMs) on benchmarks, their effectiveness in real-world, long-c...
Do Membership Inference Attacks Work on Large Language Models?
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data. Despite extensive r...
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
Remarkable progress on English instruction tuning has facilitated the efficacy and reliability of large language models (LLMs). However, there remains a...
Generative Pretrained Hierarchical Transformer for Time Series Forecasting
Recent efforts have been dedicated to enhancing time series forecasting accuracy by introducing advanced network architectures and self-supervised pretr...
How to use and interpret activation patching
Activation patching is a popular mechanistic interpretability technique, but has many subtleties regarding how it is applied and how one may interpret ...
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering
Medical open-domain question answering demands substantial access to specialized knowledge. Recent efforts have sought to decouple knowledge from model ...
Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities
Multimodal sentiment analysis (MSA) aims to understand human sentiment through multimodal data. Most MSA efforts are based on the assumption of modality...
Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales
Although social media platforms are a prominent arena for users to engage in interpersonal discussions and express opinions, the facade and anonymity of...
Transformers, parallel computation, and logarithmic depth
We show that a constant number of self-attention layers can efficiently simulate, and be simulated by, a constant number of communication rounds of Mass...
Datasheet for the Pile
This datasheet describes the Pile, an 825 GiB dataset of human-authored text compiled by EleutherAI for use in large-scale language modeling. The Pile i...
Benchmarking Vision Language Models for Cultural Understanding
Foundation models and vision-language pre-training have notably advanced Vision Language Models (VLMs), enabling multimodal processing of visual and lin...
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages. This is achi...
STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians
Recent progress in pre-trained diffusion models and 3D generation has spurred interest in 4D content creation. However, achieving high-fidelity 4D gene...
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Accurate monocular metric depth estimation (MMDE) is crucial to solving downstream tasks in 3D perception and modeling. However, the remarkable accuracy...
A Comprehensive Overview of Large Language Models (LLMs) for Cyber Defences: Opportunities and Directions
The recent progression of Large Language Models (LLMs) has witnessed great success in the fields of data-centric applications. LLMs trained on massive t...
Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition
The development of multimodal models has significantly advanced multimodal sentiment analysis and emotion recognition. However, in real-world applicatio...
KAN 2.0: Kolmogorov-Arnold Networks Meet Science
A major challenge of AI + Science lies in their inherent incompatibility: today's AI is primarily based on connectionism, while science depends on symbo...
A Survey on Hardware Accelerators for Large Language Models
Large Language Models (LLMs) have emerged as powerful tools for natural language processing tasks, revolutionizing the field with their ability to under...
A Comprehensive Survey on Kolmogorov Arnold Networks (KAN)
Through this comprehensive survey of Kolmogorov-Arnold Networks (KAN), we have gained a thorough understanding of its theoretical foundation, architectu...
Flow Matching Imitation Learning for Multi-Support Manipulation
Humanoid robots could benefit from using their upper bodies for support contacts, enhancing their workspace, stability, and ability to perform contact-r...
CRAG – Comprehensive RAG Benchmark
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Model (LLM)'s deficiency in lack of knowle...
Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation
Large language models like GPT-3.5-turbo and GPT-4 hold promise for healthcare professionals, but they may inadvertently inherit biases during their tra...
The Unreasonable Effectiveness of Eccentric Automatic Prompts
Large Language Models (LLMs) have demonstrated remarkable problem-solving and basic mathematics abilities. However, their efficacy is highly contingent...
InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior
3D Gaussians have recently emerged as an efficient representation for novel view synthesis. This work studies its editability with a particular focus o...
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
The remarkable success of Large Language Models (LLMs) has ushered natural language processing (NLP) research into a new era. Despite their diverse capa...
The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
Human feedback is central to the alignment of Large Language Models (LLMs). However, open questions remain about methods (how), domains (where), people...
OpenTab: Advancing Large Language Models as Open-domain Table Reasoners
Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks requiring knowledge ...
Towards Explainable, Safe Autonomous Driving with Language Embeddings for Novelty Identification and Active Learning: Framework and Experimental Analysis with Real-World Data Sets
This research explores the integration of language embeddings for active learning in autonomous driving datasets, with a focus on novelty detection. Nov...
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
CLIP, as a vision-language model, has significantly advanced Open-Vocabulary Semantic Segmentation (OVSS) with its zero-shot capabilities. Despite its s...
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
Mathematical reasoning is an important capability of large language models (LLMs) for real-world applications. To enhance this capability, existing work...
Probing the Creativity of Large Language Models: Can models produce divergent semantic association?
Large language models possess remarkable capacity for processing language, but it remains unclear whether these models can further generate creative con...
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Recent advances in Text-to-Video generation (T2V) have achieved remarkable success in synthesizing high-quality general videos from textual description...
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning
In recent years, instruction-tuned Large Multimodal Models (LMMs) have been successful at several tasks, including image captioning and visual question...
PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning
Vehicle motion planning is an essential component of autonomous driving technology. Current rule-based vehicle motion planning methods perform satisfact...
AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning
Large-scale pretraining followed by task-specific finetuning has achieved great success in various NLP tasks. Since finetuning all parameters of large p...
Spiral of Silence: How is Large Language Model Killing Information Retrieval? – A Case Study on Open Domain Question Answering
The practice of Retrieval-Augmented Generation (RAG), which integrates Large Language Models (LLMs) with retrieval systems, has become increasingly prev...
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data
Large Language Models (LLMs) have shown impressive abilities in data annotation, opening the way for new approaches to solve classic NLP problems. In th...
CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing
Domain generalization (DG) based Face Anti-Spoofing (FAS) aims to improve the model's performance on unseen domains. Existing methods either rely on do...
A Tale of Tails: Model Collapse as a Change of Scaling Laws
As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the siz...
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
Vision Language Models (VLMs) demonstrate remarkable proficiency in addressing a wide array of visual questions, which requires strong perception and re...
Enhancing Large Language Models for Text-to-Testcase Generation
Context: Test-driven development (TDD) is a widely employed software development practice that involves developing test cases based on requirements prio...
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Instruction-tuned Large Language Models (LLMs) have recently showcased remarkable advancements in their ability to generate fitting responses to natural...
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
This study investigates the loss of generalization ability in neural networks, revisiting warm-starting experiments from Ash & Adams. Our empirical an...
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs). Unlike...
Diffusion Models, Image Super-Resolution And Everything: A Survey
Diffusion Models (DMs) have disrupted the image Super-Resolution (SR) field and further closed the gap between image quality and human perceptual prefer...
Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing
Large language models (LLMs) are increasingly being adopted in a wide range of real-world applications. Despite their impressive performance, recent stu...
Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors
Test-time adaptation (TTA) fine-tunes pre-trained deep neural networks for unseen test data. The primary challenge of TTA is limited access to the enti...
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
The emergence of Generative Artificial Intelligence (AI) and Large Language Models (LLMs) has marked a new era of Natural Language Processing (NLP), int...
Thinking Tokens for Language Modeling
How much is 56 times 37? Language models often make mistakes in these types of difficult calculations. This is usually explained by their inability to p...
FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba
Multimodal image fusion aims to integrate information from different imaging techniques to produce a comprehensive, detail-rich single image for downst...
Low-Rank Few-Shot Adaptation of Vision-Language Models
Recent progress in the few-shot adaptation of Vision-Language Models (VLMs) has further pushed their generalization capabilities, at the expense of jus...
UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
Garment manipulation (e.g., unfolding, folding and hanging clothes) is essential for future robots to accomplish home-assistant tasks, while highly chal...
Croissant: A Metadata Format for ML-Ready Datasets
Data is a critical resource for machine learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata f...
Curriculum reinforcement learning for quantum architecture search under hardware errors
The key challenge in the noisy intermediate-scale quantum era is finding useful circuits compatible with current device limitations. Variational quantu...
Applications of Deep Neural Networks with Keras
Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network arch...
A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
With the significant development of large models in recent years, Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across...
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformat...
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
The advent of Large Language Models (LLMs) has brought an unprecedented surge in machine-generated text (MGT) across diverse channels. This raises legi...
Versatile Behavior Diffusion for Generalized Traffic Agent Simulation
Existing traffic simulation models often fail to capture the complexities of real-world scenarios, limiting the effective evaluation of autonomous driv...
COCONut: Modernizing COCO Segmentation
In recent decades, the vision community has witnessed remarkable progress in visual recognition, partially owing to advancements in dataset benchmarks....
Exploring the Potential of Large Language Models in Self-adaptive Systems
Large Language Models (LLMs), with their abilities in knowledge acquisition and reasoning, can potentially enhance the various aspects of Self-adaptive...
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
In singing voice synthesis (SVS), generating singing voices from musical scores faces challenges due to limited data availability. This study proposes ...
Decentralized Multi-Robot Navigation for Autonomous Surface Vehicles with Distributional Reinforcement Learning
Collision avoidance algorithms for Autonomous Surface Vehicles (ASV) that follow the Convention on the International Regulations for Preventing Collisio...
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
This research pioneers the use of fine-tuned Large Language Models (LLMs) to automate Systematic Literature Reviews (SLRs), presenting a significant an...
Dynamic Prompt Optimizing for Text-to-Image Generation
Text-to-image generative models, specifically those based on diffusion models like Imagen and Stable Diffusion, have made substantial advancements. Rec...
What If We Recaption Billions of Web Images with LLaMA-3?
Web-crawled image-text pairs are inherently noisy. Prior studies demonstrate that semantically aligning and enriching textual descriptions of these pai...
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains fro...
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation
We introduce Bonito, an open-source model for conditional task generation that converts unannotated text into task-specific training datasets for instru...
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
Generalist web agents have demonstrated remarkable potential in autonomously completing a wide range of tasks on real websites, significantly boosting ...
Top Leaderboard Ranking = Top Coding Proficiency, Always? EvoEval: Evolving Coding Benchmarks via LLM
LLMs have become the go-to choice for code generation tasks, with an exponential increase in the training, development, and usage of LLMs specifically f...
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
We propose TRAM, a two-stage method to reconstruct a human's global trajectory and motion from in-the-wild videos. TRAM robustifies SLAM to recover the ...
A Survey on Kolmogorov-Arnold Network
This systematic review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KAN), a neural...
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Instruction-tuned Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features that...
DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation
Imitation learning from human hand motion data presents a promising avenue for imbuing robots with human-like dexterity in real-world manipulation task...
Kolmogorov-Arnold Networks are Radial Basis Function Networks
This short paper is a fast proof-of-concept that the 3-order B-splines used in Kolmogorov-Arnold Networks (KANs) can be well approximated by Gaussian ra...
Spectral Networks and Locally Connected Networks on Graphs
Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local t...
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
This paper presents a novel method for exerting fine-grained lighting control during text-driven diffusion-based image generation. While existing diffu...
Personalized Language Modeling from Personalized Human Feedback
Personalized large language models (LLMs) are designed to tailor responses to individual user preferences. While Reinforcement Learning from Human Feed...
Research on Autonomous Robots Navigation based on Reinforcement Learning
Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environ...
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
The Open-MAGVIT2 project produces an open-source replication of Google's MAGVIT-v2 tokenizer, a tokenizer with a super-large codebook (i.e., 2^18 codes)...
Dense Reward for Free in Reinforcement Learning from Human Feedback
Reinforcement Learning from Human Feedback (RLHF) has been credited as the key advance that has allowed Large Language Models (LLMs) to effectively fol...
Understanding Test-Time Augmentation
Test-Time Augmentation (TTA) is a very powerful heuristic that takes advantage of data augmentation during testing to produce averaged output. Despite t...
3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation
This paper introduces a network for volumetric segmentation that learns from sparsely annotated volumetric images. We outline two attractive use cases ...
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation providing 1200 hours of audi...
Multi-Object Hallucination in Vision-Language Models
Large vision language models (LVLMs) often suffer from object hallucination, producing objects not present in the given images. While current benchmark...
A Novel Paradigm Boosting Translation Capabilities of Large Language Models
This paper presents a study on strategies to enhance the translation capabilities of large language models (LLMs) in the context of machine translation ...
MOMENT: A Family of Open Time-series Foundation Models
We introduce MOMENT, a family of open-source foundation models for general-purpose time series analysis. Pre-training large models on time series data i...
BASS: Batched Attention-optimized Speculative Sampling
Speculative decoding has emerged as a powerful method to improve latency and throughput in hosting large language models. However, most existing impleme...
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Mathematical equations have been unreasonably effective in describing complex natural phenomena across various scientific disciplines. However, discove...
Economics
No articles found.
Electrical Engineering and Systems Science
VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis
Self-Supervised Learning (SSL) has demonstrated promising results in 3D medical image analysis. However, the lack of high-level semantics in pre-trainin...
Benchmarking foundation models as feature extractors for weakly-supervised computational pathology
Advancements in artificial intelligence have driven the development of numerous pathology foundation models capable of extracting clinically relevant in...
ECGformer: Leveraging transformer for ECG heartbeat arrhythmia classification
An arrhythmia, also known as a dysrhythmia, refers to an irregular heartbeat. There are various types of arrhythmias that can originate from different ...
Mathematics
High-fidelity single-spin shuttling in silicon
The computational power and fault-tolerance of future large-scale quantum processors derive in large part from the connectivity between the qubits. One...
Exact Thermal Eigenstates of Nonintegrable Spin Chains at Infinite Temperature
The eigenstate thermalization hypothesis (ETH) plays a major role in explaining thermalization of isolated quantum many-body systems. However, there has...
Physics
Precise test of lepton flavour universality in W-boson decays into muons and electrons in pp collisions at √s = 13 TeV with the ATLAS detector
The ratio of branching ratios of the W boson to muons and electrons, Rμ/eW=B(W→μν)/B(W→eν), has been measured using 140 fb−1 of pp collision data at s...
Spin-polarized Specular Andreev Reflections in Altermagnets
We show theoretically that specular Andreev reflection occurs stably at altermagnet–superconductor interfaces, which is a phenomenon that has previously...
Parametric multi-element coupling architecture for coherent and dissipative control of superconducting qubits
As systems for quantum computing keep growing in size and number of qubits, challenges in scaling the control capabilities are becoming increasingly rel...
The Quantum Internet
Quantum networks offer a unifying set of opportunities and challenges across exciting intellectual and technical frontiers, including for quantum comput...
Hardware-efficient quantum error correction via concatenated bosonic qubits
In order to solve problems of practical importance, quantum computers will likely need to incorporate quantum error correction, where a logical qubit i...
Nonreciprocal Quantum Batteries
Nonreciprocity, arising from the breaking of time-reversal symmetry, has become a fundamental tool in diverse quantum technology applications. It enable...
Iterative assembly of ^171Yb atom arrays with cavity-enhanced optical lattices
Assembling and maintaining large arrays of individually addressable atoms is a key requirement for continued scaling of neutral-atom-based quantum comp...
Quantum Melting of a Disordered Wigner Solid
The behavior of two-dimensional electron gas (2DEG) in extreme coupling limits is reasonably well-understood, but our understanding of intermediate reg...
Solving the strong CP problem without axions
We formulate general conditions under which the strong CP problem is solved by spontaneous CP violation. Quark-mass matrix elements are polynomials in ...
DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman Alpha Forest
We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-α (Lyα) forest of high-redshift quasars with the first-year dataset of t...
Gravitational entropy is observer-dependent
In quantum gravity, it has been argued that a proper accounting of the role played by an observer promotes the von Neumann algebra of observables in a g...
How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits
We significantly reduce the cost of factoring integers and computing discrete logarithms in finite fields on a quantum computer by combining techniques...
A Review of Gravitational Memory and BMS Frame Fixing in Numerical Relativity
Gravitational memory effects and the BMS freedoms exhibited at future null infinity have recently been resolved and utilized in numerical relativity sim...
The Sonora Substellar Atmosphere Models. IV. Elf Owl: Atmospheric Mixing and Chemical Disequilibrium with Varying Metallicity and C/O Ratios
Disequilibrium chemistry due to vertical mixing in the atmospheres of many brown dwarfs and giant exoplanets is well-established. Atmosphere models for...
Distinguishing oceans of water from magma on mini-Neptune K2-18b
Mildly irradiated mini-Neptunes have densities potentially consistent with them hosting substantial liquid water oceans ('Hycean' planets). The presenc...
Krylov complexity of density matrix operators
Quantifying complexity in quantum systems has witnessed a surge of interest in recent years, with Krylov-based measures such as Krylov complexity (C_K)...
Quantitative Biology
Enhancing the efficiency of protein language models with minimal wet-lab data through few-shot learning
Accurately modeling the protein fitness landscapes holds great importance for protein engineering. Recently, due to their capacity and representation ab...
Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2
Protein diffusion models have emerged as a promising approach for protein design. One such pioneering model is Genie, a method that asymmetrically repre...
Quantitative Finance
No articles found.
Statistics
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead
Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, c...
Linear Model and Extensions
I developed the lecture notes based on my “Linear Model” course at the University of California Berkeley over the past seven years. This book provides a...
Fitting Linear Mixed-Effects Models using lme4
Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer f...