Running Tabular Cleaning Baselines
Raha
https://github.com/BigDaMa/raha
PClean
1 | |
Change the current directory.
1 | |
Activate the project and install dependencies.
1 | |
Run the command.
1 | |
Cocoon
Running the Cocoon tutorial
https://github.com/Cocoon-Data-Transformation/cocoon
Garf
https://github.com/PJinfeng/Garf-master
Garf requires an Oracle database.
HoloClean
Git: https://github.com/HoloClean/holoclean
Related Info
System Version:
1 | |
0. Prerequisites
1 | |
1. Download code
1 | |
2. Create and activate a conda environment
1 | |
3. Install dependencies
1 | |
4. Test
1 | |
Question:
Q1:
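Error when installing python-Levenshtein:
The package is built from C source, so the build needs a compiler and the Python development headers.
- Ubuntu/Debian System:
    sudo apt-get install build-essential python3-dev
- CentOS/Fedora System:
    yum groupinstall 'Development Tools'
    yum install python3-devel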
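Before retrying the install, it can help to confirm that the two build prerequisites are actually visible to the current interpreter. The following is a minimal sketch, not part of any of the baselines: the function name `missing_build_prerequisites` is hypothetical, and it assumes that the failure comes from a missing C compiler or a missing `Python.h`.

```python
import shutil
import sysconfig
from pathlib import Path


def missing_build_prerequisites():
    """Return the names of build prerequisites that could not be found."""
    missing = []
    # A C compiler is needed to build the extension module from source.
    if not any(shutil.which(cc) for cc in ("cc", "gcc", "clang")):
        missing.append("C compiler")
    # Python.h is provided by the Python development headers package.
    include_dir = sysconfig.get_paths()["include"]
    if not (Path(include_dir) / "Python.h").is_file():
        missing.append("Python development headers")
    return missing


if __name__ == "__main__":
    problems = missing_build_prerequisites()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

If the script reports nothing missing and the install still fails, the cause is probably something other than the build toolchain.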
Q2:
Error Message:
1 | |
Cause:
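`from __future__ import annotations` is only available in Python 3.7 and later, so a newer smart_open release that relies on it cannot be imported by an older interpreter.
Downgrade smart_open to a release that is compatible with the installed Python version.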
1 | |
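To check whether an environment is affected, compare the interpreter version against 3.7, the first release that accepts the `annotations` future import. This is a small sketch; the helper name `supports_future_annotations` is an assumption made for illustration.

```python
import sys


def supports_future_annotations(version_info=sys.version_info):
    """Return True if this Python accepts `from __future__ import annotations`."""
    # The `annotations` future feature was introduced in Python 3.7.
    return tuple(version_info[:2]) >= (3, 7)


if __name__ == "__main__":
    if supports_future_annotations():
        print("This interpreter supports the annotations future import.")
    else:
        print("Older interpreter: pin smart_open to an earlier release.")
```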
RTClean
https://github.com/delgaudl/RTClean
Download
1 | |
1 | |
1. Create conda env
1 | |
2. Modify requirements.txt
1 | |
If you are behind a proxy, you may need to set:
1 | |
1 | |
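When the install is driven from a script, one option is to pass the proxy through environment variables, which pip reads. The sketch below assumes that approach; the helper `with_proxy` and the proxy address in the usage note are hypothetical placeholders, not values from this setup.

```python
import os


def with_proxy(proxy_url, base_env=None):
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Set both spellings, since tools differ in which one they read.
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy_url
    return env
```

The returned mapping can be handed to `subprocess.run([...], env=with_proxy("http://127.0.0.1:8080"))` when calling `python -m pip install -r requirements.txt`, so the proxy applies to that command only.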
3. Modify Code
Error Message:
1 | |
time.clock() was removed in Python 3.8. Replace every call to time.clock() with time.time().
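Doing the replacement by hand is error-prone when the call appears in several files. The following sketch applies it to every `.py` file under a directory; the function names are hypothetical, and it assumes the calls are written literally as `time.clock()`.

```python
import re
from pathlib import Path

# Matches the literal call `time.clock()` but not names such as `mytime.clock()`.
CLOCK_CALL = re.compile(r"\btime\.clock\(\)")


def replace_clock_calls(source):
    """Return the source text with time.clock() replaced by time.time()."""
    return CLOCK_CALL.sub("time.time()", source)


def patch_tree(root):
    """Rewrite every .py file under root in place; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        original = path.read_text(encoding="utf-8")
        patched = replace_clock_calls(original)
        if patched != original:
            path.write_text(patched, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `patch_tree(".")` from the repository root rewrites the files in place, so it is worth committing or backing up the checkout first.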
4. Test HoloClean
examples/holoclean_repair_example.py
5. Install extra requirements
1 | |
https://www.hardyhu.cn/2025/02/24/Running-Tablular-Cleaning-Baselines/