AI-generated texts have a wide range of use cases across industries and domains, such as content generation, personalized marketing, virtual assistants and chatbots, and creative writing and storytelling. While AI-generated texts offer many benefits and applications, they also carry inherent risks that need to be considered.
1. Misinformation and Fake News: AI-generated texts can be used to spread misinformation or generate fake news articles that appear authentic. This poses a significant risk to society as false information can quickly spread, leading to confusion, manipulation, and the erosion of trust in media and communication channels.
2. Bias and Discrimination: AI models are trained on vast amounts of data, which can reflect biases present in society. If not properly identified and addressed, AI-generated texts can perpetuate and amplify existing biases, including racial, gender, or social biases. This can lead to biased recommendations, unfair decision-making processes, or discriminatory content.
3. Lack of Accountability: As AI systems generate texts autonomously, it becomes challenging to hold someone accountable for the content produced. The absence of clear guidelines or oversight can allow malicious actors to exploit AI-generated texts for unethical purposes or illegal activities.
4. Restricted Privacy: AI models are built using large amounts of information, often including the personal data of users. As the technology's conversational ability improves, the inability to distinguish between human and machine puts human users at risk of having their personal information leaked and used for unethical or malicious purposes, e.g. sales and marketing data or deepfakes.
bitgrit wants to tackle these complex issues by identifying AI-generated texts with machine learning. Your task is to develop an algorithm that distinguishes AI-generated texts from human-written texts.
- 1st Prize: $1,500
- 2nd Prize: $1,000
- 3rd Prize: $500
- Competition Starts: 1 November 2023
- Competition Ends: 31 January 2024
- Winners Announced (Subject to change based on submission results): 20 February 2024
The goal of this competition is to predict whether each text was generated by AI, as indicated by the "ind" column (1: generated by AI, 0: generated by humans), using the following information:
feature_0 ~ feature_767: word embeddings of the sentence.
word_count: the number of words in the sentence.
punc_num: the number of punctuation marks in the sentence.
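As a rough illustration of the layout described above, the sketch below reads rows into feature vectors of length 770 (768 embedding values plus word_count and punc_num) and a list of "ind" labels. It assumes the CSV header uses exactly the column names listed here; the `load_rows` helper and the two synthetic rows are hypothetical and stand in for training_set.csv, whose actual header should be checked first.

```python
import csv
import io

# Column names as described above; the exact header spelling is an assumption.
EMBEDDING_COLS = [f"feature_{i}" for i in range(768)]
FEATURE_COLS = EMBEDDING_COLS + ["word_count", "punc_num"]
TARGET_COL = "ind"


def load_rows(handle):
    """Read a CSV file object into (features, labels) lists of numbers.

    Columns not named in FEATURE_COLS or TARGET_COL are ignored.
    """
    features, labels = [], []
    for row in csv.DictReader(handle):
        features.append([float(row[col]) for col in FEATURE_COLS])
        labels.append(int(row[TARGET_COL]))
    return features, labels


# Tiny synthetic stand-in for training_set.csv (two made-up rows).
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(FEATURE_COLS + [TARGET_COL])
writer.writerow([0.1] * 768 + [12, 3, 1])
writer.writerow([-0.2] * 768 + [7, 1, 0])
sample.seek(0)

X, y = load_rows(sample)
print(len(X), len(X[0]), y)
```

With the real data, the same function could be called on `open("training_set.csv", newline="")` instead of the in-memory sample.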
The downloadable file "ai-generated-text-data.zip" includes the following files:
1. training_set.csv: file to train your machine learning model.
2. test_set.csv: file that can be used to test how well your model performs on unseen data. This is the file you're going to make predictions on with your trained model and create a submission file.
3. solution_format.csv: example of the format that the submission file needs to be in to be properly scored.
*The submission file should follow the same format as the example file (solution_format.csv). The submission file has 2 columns, one for id and one for value. BOTH have to be passed as STRINGS, which means that every value in both columns must be enclosed in double quote marks (" "). If the values are numeric (no quote marks), the score will be 0.
**Submissions are evaluated by F1 Score.
***Final competition results are based on the Private Leaderboard results, and the winner will be the user at the top of the Private Leaderboard.
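The quoting requirement and the F1 metric from the notes above can be sketched as follows. This is a minimal illustration, not the organizers' scoring code: it assumes F1 is computed for the positive class (1: generated by AI), and the header names "id" and "value" and the helper names `f1_score` and `write_submission` are hypothetical. The header of solution_format.csv should be treated as the authority.

```python
import csv
import io


def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = AI-generated): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def write_submission(handle, ids, predictions):
    """Write id/value rows with every field wrapped in double quotes."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["id", "value"])  # header names are an assumption
    for row_id, pred in zip(ids, predictions):
        writer.writerow([str(row_id), str(pred)])


# Local check on made-up labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(f1_score(y_true, y_pred))

# Write to an in-memory buffer; a real run would open submission.csv instead.
buffer = io.StringIO()
write_submission(buffer, ids=[0, 1, 2], predictions=[1, 0, 1])
print(buffer.getvalue())
```

Using `csv.QUOTE_ALL` forces quote marks around every field, which is the property the note on string values asks for; a default CSV writer would leave numeric-looking values unquoted.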
1. This competition is governed by the following Terms of Participation. Participants must agree to and comply with these Terms to participate.
2. Users can make a maximum of 3 submissions per day. If users want to submit new files after making 3 submissions in a day, they will have to wait until the following day to do so. Please keep this in mind when uploading a submission.csv file. Any attempt to circumvent stated limits will result in disqualification.
3. The use of external datasets is not allowed.
4. Uploading the competition dataset to other websites is not allowed. Users who do not comply with this rule will be disqualified.
5. A competition prize will be awarded after we have received, successfully executed, and confirmed the validity of both the code and the solution (See 6.). Once winners are announced and our team reaches out to them, the winners must provide the following by February 15, 2024 to be qualified as a competition winner and receive their prize:
a. All source files required to preprocess the data
b. All source files required to build, train and make predictions with the model using the processed data
c. A requirements.txt (or equivalent) file indicating all the required libraries and their versions as needed
d. A ReadMe file containing the following:
• Clear and unambiguous instructions on how to reproduce the predictions from start to finish including data pre-processing, feature extraction, model training, and predictions generation
• Environment details regarding where the model was developed and trained, including OS, memory (RAM), disk space, CPU/GPU used, and any environment configurations required to execute the code
• Clear answers to the following questions:
- Which data files are being used?
- How are these files processed?
- What is the algorithm used and what are its main hyperparameters?
- Any other comments considered relevant to understanding and using the model
6. The submitted solution should be able to generate exactly the same output that gives the corresponding score on the leaderboard. If the score obtained from the code is different from what’s shown on the leaderboard, the new score will be used for the final rankings unless a logical explanation is provided. Please make sure to set the seed or random_state etc. so we can obtain the same result from your code.
7. The final submission has to be selected manually before the end of the competition (you can select up to 2), or else it will be selected automatically based on your highest public score.
8. In order to be eligible for the prize, the competition winner must agree to transfer to the Host and the relevant transferee of rights in such Competition all transferable rights, such as copyrights, rights to obtain patents and know-how, etc. in and to all analysis and prediction results, reports, analysis and prediction model, algorithm, source code and documentation for the model reproducibility, etc., and the Submissions contained in the Final Submissions.
9. Any prize awards are subject to eligibility verification and compliance with these Terms of Participation. All decisions of bitgrit will be final and binding on all matters relating to this Competition.
10. Payments to winners may be subject to local, state, federal and foreign tax reporting and withholding requirements.
11. If two or more participants have the same score on the leaderboard, the participant who submitted the winning file first will be considered the winner.
12. All submissions must be made individually; no teams are allowed in this competition. Users who do not comply with this rule will be immediately disqualified in the case that we find the same or very similar scores and/or uploaded solutions.
13. If you have any inquiries about this competition, please don’t hesitate to reach out to us at email@example.com.
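Rule 6 above asks for a fixed seed or random_state so that results can be reproduced. As a minimal sketch of the idea using only the standard library, the hypothetical helper below derives a train/holdout split from a seeded random number generator, so repeated runs give the same split. The seed value and the function name are illustrative assumptions; libraries used for actual modeling have their own seed or random_state settings that would need to be fixed as well.

```python
import random

SEED = 42  # arbitrary fixed value; any constant works


def shuffled_split(n_rows, holdout_fraction, seed):
    """Return (train_idx, holdout_idx) from a shuffle driven by a local RNG."""
    rng = random.Random(seed)  # local generator, independent of global state
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = int(n_rows * (1 - holdout_fraction))
    return indices[:cut], indices[cut:]


first = shuffled_split(10, 0.2, SEED)
second = shuffled_split(10, 0.2, SEED)
print(first == second)  # same seed, same split
```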
Non-Disclosure Agreement (NDA)
An agreement to not reveal the information shared regarding this competition to others.
- This Non-Disclosure Agreement (“Agreement”) is hereby entered into on 9th December 2023 (“Effective Date”) between you (“Participant”), as a participant in the AI Generated Text Classification Challenge (the “Competition”) hosted at bitgrit.net (the “Competition Site”), and bitgrit Inc. (“Bitgrit”).
- Purpose: This Agreement aims to protect information disclosed by Bitgrit to Participant (the “Purpose”).
- Confidential Information: (1) Confidential Information shall mean any and all information disclosed by Bitgrit to the Participant with regard to the entry and participation in the Competition, including (i) metadata, source code, object code, firmware etc. and, in addition to these, (ii) analyses, compilations or any other deliverable produced by the Participant in which such disclosed information is utilized or reflected. (2) Confidential Information shall not include information which: (a) is now or hereafter becomes, through no act or omission by the Participant, generally known or available to the public, or, in the present or into the future, enters the public domain through no act or omission by the Participant; (b) is acquired by the Participant before receiving such information from Bitgrit and such acquisition was without restriction as to the use or disclosure of the same; (c) is hereafter rightfully furnished to the Participant by a third party, without restriction as to use or disclosure of the same.
- Non-Disclosure Obligation: The Participant agrees: (a) to hold Confidential Information in strict confidence; (b) to exercise at least the same care in protecting Confidential Information from disclosure as the Participant uses with regard to its own confidential information; (c) not to use any Confidential Information except as it concerns the Purpose elaborated upon above; (d) not to disclose such Confidential Information to third parties; (e) to inform Bitgrit if it becomes aware of an unauthorized disclosure of Confidential Information.
- No Warranty: All Confidential Information is provided “as is.” None of the Confidential Information shall contain any representation, warranty, assurance, or integrity by Bitgrit to the Participant of any kind.
- No Granting of Rights: The Participant agrees that nothing contained in this Agreement shall be construed as conferring, transferring or granting any rights to the Participant, by license or otherwise, to use any of the Confidential Information.
- No Assignment: Participant shall not assign, transfer or otherwise dispose of this Agreement or any of its rights, interest or obligations hereunder without the prior written consent of Bitgrit.
- Injunctive Relief: In the event of a breach or the possibility of breach of this Agreement by the Participant, in addition to any remedies otherwise available, Bitgrit shall be entitled to seek injunctive relief or equitable relief, as well as monetary damages.
- Return/Destruction of the Confidential Information: (1) On the request of Bitgrit, the Participant shall promptly, in a manner specified by Bitgrit, return or destroy the Confidential Information along with any copies of said information. (2) Bitgrit may request the Participant to submit documentation to confirm the destruction of said Confidential Information to Bitgrit in the event that Bitgrit requests the Participant to destroy this Confidential Information, pursuant to the provision of the preceding paragraph.
- Term: The obligations with respect to the Confidential Information under this Agreement shall survive for a period of three (3) years after the effective date. Provided however, if the Confidential Information could be considered to fall under the category of “Trade Secret” of Bitgrit or any related third parties, this Agreement is to remain effective relative to that information for as far as the said information is regarded as Trade Secret under applicable laws and regulations. If the Confidential Information contains personal information, the terms of this Agreement shall remain effective on that information permanently.
- Governing Law: This Agreement shall be governed by and construed and interpreted under the laws of Japan without reference to its principles governing conflicts of laws.