Updates (2020.08.16)
This update is mainly about miscellaneous fixes, but I've also introduced a toy example to reveal the power of carefree-learn - the famous Titanic competition!
Here is the source code:
import os
import cflearn

from cfdata.tabular import *

file_folder = os.path.dirname(__file__)


def test():
    train_file = os.path.join(file_folder, "train.csv")
    test_file = os.path.join(file_folder, "test.csv")
    data_config = {"label_name": "Survived"}
    # Search for good hyper parameters with the HPO API
    hpo = cflearn.tune_with(
        train_file,
        model="tree_dnn",
        temp_folder="__hpo__",
        task_type=TaskTypes.CLASSIFICATION,
        data_config=data_config,
        num_parallel=0,
    )
    # Train 10 models with the best hyper parameters found above
    results = cflearn.repeat_with(
        train_file,
        **hpo.best_param,
        models="tree_dnn",
        temp_folder="__repeat__",
        num_repeat=10,
        num_jobs=0,
        data_config=data_config,
    )
    # Ensemble the 10 trained models and predict on the test file directly
    ensemble = cflearn.EnsemblePattern(results.patterns["tree_dnn"])
    predictions = ensemble.predict(test_file).ravel()
    # Recover the PassengerId column (the first column) for the submission
    x_te, _ = results.transformer.data.read_file(test_file, contains_labels=False)
    id_list = DataTuple.with_transpose(x_te, None).xT[0]
    # Score: achieved ~0.79
    with open("submissions.csv", "w") as f:
        f.write("PassengerId,Survived\n")
        for test_id, prediction in zip(id_list, predictions):
            f.write(f"{test_id},{prediction}\n")


if __name__ == "__main__":
    test()
As you can see, carefree-learn doesn't need explicit data preprocessing - it can take files as inputs and predict on files directly! Moreover, common practices such as hyper-parameter tuning (cflearn.tune_with) and ensembling (cflearn.repeat_with & cflearn.EnsemblePattern) can be completed in a few lines of code. These APIs also hide other common practices (such as cross validation) under the hood, so the final performance is quite promising: I achieved ~0.79, and the best run achieved 0.81+, which is almost SOTA among other (more complicated) neural network solutions [1][2][3][4].
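For the curious, here is a minimal, hypothetical sketch of what the ensembling step boils down to. It is not cflearn's actual internals - `models` and `predict` stand in for the 10 repeated patterns above - but for a binary label, a majority vote over the individual predictions captures the idea:

import numpy as np

# Hypothetical sketch of majority-vote ensembling over binary (0/1)
# predictions; `models` is any list of objects exposing `predict`,
# not cflearn's actual internals.
def majority_vote(models, x):
    # Shape: (num_models, num_samples)
    votes = np.stack([m.predict(x).ravel() for m in models])
    # With 0/1 votes, rounding the mean implements the majority vote
    return np.round(votes.mean(axis=0)).astype(int)

Averaging like this tends to cancel out the variance of the individual runs, which is why training with num_repeat=10 and ensembling usually beats a single model.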