Abstract
Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks. However, when probing language models using a range of basic table-understanding tasks, we observe that today's language models are still sub-optimal in many table-related tasks, likely because they are pre-trained predominantly on one-dimensional natural-language texts, whereas relational tables are two-dimensional objects. In this work, we propose a new "table-tuning" paradigm, where we continue to train/fine-tune language models like GPT-3.5 and ChatGPT, using diverse table tasks synthesized from real tables as training data, with the goal of enhancing language models' ability to understand tables and perform table tasks. We show that our resulting Table-GPT models demonstrate (1) better table-understanding capabilities, by consistently outperforming the vanilla GPT-3.5 and ChatGPT on a wide range of table tasks, including held-out unseen tasks, and (2) strong generalizability, in their ability to respond to diverse human instructions to perform new table tasks, in a manner similar to GPT-3.5 and ChatGPT.
Community
Write a top 5 points summary
Start discussing this paper
Wtf is going on in these comments lol? Anyway, here's my summary...
Tables are everywhere - reports, databases, webpages. They neatly organize data for humans to parse. But despite strong language skills, AI still struggles with table comprehension.
Even models like GPT-3.5 fail at basic tasks like finding where a missing value should go. This is because they're trained mostly on free-flowing text, not 2D tabular data. Unlike unstructured text, data in tables derives meaning from its structure and position!
So researchers at Microsoft tried "table-tuning" - extending training with synthesized table task cases. Tasks like "impute missing value X" or "identify outliers in this table". They did this using a corpus of real-world tables.
They also augmented the data further by paraphrasing, reordering rows/columns, chaining model responses, and more. This protects against overfitting.
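For a concrete picture, here is a minimal sketch of that synthesize-then-augment recipe, assuming pandas; the prompt wording, function names, and toy table are my own illustration, not the authors' code.

```python
# Minimal sketch (not the authors' code): build one "impute the missing value"
# training instance from a real table, then augment it by reordering rows/columns.
# Note: DataFrame.to_markdown needs the optional 'tabulate' package.
import random
import pandas as pd

def synthesize_impute_task(df: pd.DataFrame) -> dict:
    """Mask one random cell and ask the model to recover it."""
    row = random.randrange(len(df))
    col_pos = random.randrange(len(df.columns))
    answer = df.iat[row, col_pos]
    masked = df.copy()
    masked.iat[row, col_pos] = "[MASK]"
    prompt = ("Fill in the missing value marked [MASK] in the table below.\n\n"
              + masked.to_markdown(index=False))
    return {"prompt": prompt, "completion": str(answer)}

def augment(df: pd.DataFrame) -> pd.DataFrame:
    """Reordering rows/columns must not change the answer, so use it as augmentation."""
    cols = random.sample(list(df.columns), len(df.columns))
    return df[cols].sample(frac=1).reset_index(drop=True)

table = pd.DataFrame({"country": ["France", "Japan", "Brazil"],
                      "capital": ["Paris", "Tokyo", "Brasilia"],
                      "continent": ["Europe", "Asia", "South America"]})
example = synthesize_impute_task(augment(table))
print(example["prompt"])
print("Expected answer:", example["completion"])
```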
The resulting Table-GPT models showed big improvements:
- 25%+ better at unseen table tasks like missing-value identification
- Beat GPT-3.5 on 98% of test cases across 9 different table tasks
- Stayed strong even after targeted downstream tuning
Table-tuning seems a promising step toward AI that can handle tables. That would unlock automated analysis over the troves of valuable tabular data out there.
TLDR: Training models on a large and diverse dataset of synthesized table tasks significantly boosts their table skills.
Can you opensource your training datasets of the different table tasks?
Not sure about that, but here's a table dataset with more than 800 billion tokens!
https://huggingface.co/datasets/approximatelabs/tablib-v1-full
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Testing the Limits of Unified Sequence to Sequence LLM Pretraining on Diverse Table Data Tasks (2023)
- LLM-augmented Preference Learning from Natural Language (2023)
- Efficient Finetuning Large Language Models For Vietnamese Chatbot (2023)
- A Systematic Evaluation of Large Language Models on Out-of-Distribution Logical Reasoning Tasks (2023)
- Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model (2023)
I am still not convinced of GPT's ability to handle arithmetic calculations in any context, be it tables or time-series data. So far, the output I have seen is not even close.
Key conclusion from my review of v1 of this paper (published on 13th Oct 2023):
This paper offers a good introduction to simple table-tuning tasks; however, task T-3 (TQA) should be significantly improved before Table-GPT can be used commercially.
Key points:
• The overall results indicate that table-tuning works very well; however, in my opinion, the tasks should be divided by complexity to better understand the value of table-tuning. Please see the diagram below.
• For most of the easy tasks (all tasks except T-3), table-tuning offers great improvements in the zero-shot setting compared to the vanilla models (209% improvement for GPT-3.5 and 119% for ChatGPT)
• For most of the easy tasks (all tasks except T-3), table-tuning offers good results in the few-shot setting compared to the vanilla models (22% improvement for GPT-3.5 and 12% for ChatGPT); see the prompt sketch after this list for what the two settings look like
• T-3 (TQA) is the most complex task (and the one with the biggest business demand), and for this task table-tuning offers very small improvements (1-2% for ChatGPT and 5-8% for GPT-3.5), which is probably not worth the fine-tuning effort
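To clarify what the zero-shot and few-shot settings compared above look like in practice, here is a rough sketch of how the two prompt styles might be assembled for a table task; the instruction text and layout are my own illustration, not taken from the paper.

```python
# Rough sketch (illustrative, not from the paper): zero-shot vs. few-shot prompts
# for the same table task. Few-shot simply prepends worked (table, answer) demos.
TASK_INSTRUCTION = "Identify the row in the table below that contains an outlier value."

def zero_shot_prompt(table_text: str) -> str:
    # Instruction plus the test table only.
    return f"{TASK_INSTRUCTION}\n\n{table_text}\n\nAnswer:"

def few_shot_prompt(table_text: str, demos: list[tuple[str, str]]) -> str:
    # Prepend a handful of demonstrations before the test table.
    parts = [f"{TASK_INSTRUCTION}\n\n{t}\n\nAnswer: {a}" for t, a in demos]
    parts.append(f"{TASK_INSTRUCTION}\n\n{table_text}\n\nAnswer:")
    return "\n\n---\n\n".join(parts)
```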
Open questions:
• Do you have plans to fine-tune GPT-4?
• Can you share recommendations on improving T-3 (TQA)? Maybe including TQA tasks in training?
• Can you include as well T-12 (NS) in tests?
• Can you specify the number of tokens used (both for training and test execution) for each task?
Other remarks:
• Markdown format increases the performance of table-tuning by 3% compared to CSV and by 5% compared to JSON (Table 5); a serialization sketch follows this list
• For most of the tasks, few-shot prompting offers a strong improvement over zero-shot for vanilla GPT-3.5 and ChatGPT (even without table-tuning).
• Typos found in paper:
- p.4: "toke-by-token" should be "token-by-token"
- p.6: "few select table-tasks" should be "few selected table-tasks"
- p.7: "describes the row-augmentation task" should be "describes the column-augmentation task"
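To make the format remark above concrete, here is a tiny sketch (mine, not the paper's) of the same table serialized as Markdown, CSV, and JSON before being placed in a prompt.

```python
# Tiny sketch: one table, three serializations (Markdown vs. CSV vs. JSON).
# DataFrame.to_markdown requires the optional 'tabulate' package.
import pandas as pd

df = pd.DataFrame({"city": ["Berlin", "Madrid"], "population_m": [3.6, 3.3]})

print(df.to_markdown(index=False))    # | city | population_m | ...
print(df.to_csv(index=False))         # city,population_m\nBerlin,3.6 ...
print(df.to_json(orient="records"))   # [{"city":"Berlin","population_m":3.6},...]
```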
Revolutionizing AI: Table-GPT Enhances Language Models for Complex Table Tasks!