bert-dp-second

This model is a fine-tuned version of an unspecified base model on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2321
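
The base model and downstream task are not documented in this card; assuming the checkpoint is a BERT-style masked language model hosted on the Hugging Face Hub, a minimal loading sketch could look like the following (the repo id `bert-dp-second` is a placeholder for the actual Hub path):

```python
# Hypothetical usage sketch: assumes this checkpoint is a masked language model.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "bert-dp-second"  # placeholder: replace with the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask inference with the model's own mask token
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"The capital of France is {tokenizer.mask_token}."))
```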

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 19
  • mixed_precision_training: Native AMP
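
As referenced above, here is a minimal sketch of how these values map onto `transformers.TrainingArguments`, assuming the standard `Trainer` API was used; model, dataset, and data-collator setup are omitted:

```python
# Sketch only: mirrors the hyperparameters listed above (transformers 4.26.x API).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-dp-second",      # placeholder output directory
    learning_rate=5e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=19,
    fp16=True,                        # "Native AMP" mixed-precision training
)
```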

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|--------------:|------:|------:|----------------:|
| 7.3416 | 0.23  | 500   | 6.6532 |
| 6.5752 | 0.47  | 1000  | 6.5275 |
| 6.4866 | 0.7   | 1500  | 6.4720 |
| 6.4273 | 0.93  | 2000  | 6.4540 |
| 6.4036 | 1.17  | 2500  | 6.4236 |
| 6.3779 | 1.4   | 3000  | 6.4018 |
| 6.3528 | 1.63  | 3500  | 6.3768 |
| 6.3258 | 1.87  | 4000  | 6.3679 |
| 6.3009 | 2.1   | 4500  | 6.3305 |
| 6.2646 | 2.33  | 5000  | 6.3142 |
| 6.2583 | 2.57  | 5500  | 6.3004 |
| 6.2223 | 2.8   | 6000  | 6.2605 |
| 6.1941 | 3.03  | 6500  | 6.2353 |
| 6.1382 | 3.27  | 7000  | 6.2095 |
| 6.1301 | 3.5   | 7500  | 6.1774 |
| 6.09   | 3.73  | 8000  | 6.1480 |
| 6.0624 | 3.97  | 8500  | 6.1061 |
| 6.0056 | 4.2   | 9000  | 6.0655 |
| 5.9444 | 4.43  | 9500  | 5.9461 |
| 5.7101 | 4.67  | 10000 | 5.2594 |
| 5.005  | 4.9   | 10500 | 4.7348 |
| 4.6127 | 5.13  | 11000 | 4.4626 |
| 4.3907 | 5.37  | 11500 | 4.2862 |
| 4.241  | 5.6   | 12000 | 4.1701 |
| 4.1286 | 5.83  | 12500 | 4.0673 |
| 4.0151 | 6.07  | 13000 | 3.9967 |
| 3.934  | 6.3   | 13500 | 3.9292 |
| 3.8789 | 6.53  | 14000 | 3.8707 |
| 3.8231 | 6.77  | 14500 | 3.8222 |
| 3.7696 | 7.0   | 15000 | 3.7800 |
| 3.7078 | 7.23  | 15500 | 3.7424 |
| 3.6671 | 7.47  | 16000 | 3.7093 |
| 3.6446 | 7.7   | 16500 | 3.6780 |
| 3.6069 | 7.93  | 17000 | 3.6476 |
| 3.5782 | 8.17  | 17500 | 3.6283 |
| 3.5384 | 8.4   | 18000 | 3.6098 |
| 3.5245 | 8.63  | 18500 | 3.5942 |
| 3.5209 | 8.87  | 19000 | 3.5841 |
| 3.4948 | 9.1   | 19500 | 3.5728 |
| 3.4877 | 9.33  | 20000 | 3.5692 |
| 3.4818 | 9.57  | 20500 | 3.5641 |
| 3.4844 | 9.8   | 21000 | 3.5640 |
| 3.5323 | 10.03 | 21500 | 3.6026 |
| 3.5123 | 10.27 | 22000 | 3.5877 |
| 3.5046 | 10.5  | 22500 | 3.5595 |
| 3.4787 | 10.73 | 23000 | 3.5403 |
| 3.4568 | 10.97 | 23500 | 3.5125 |
| 3.4154 | 11.2  | 24000 | 3.4916 |
| 3.3998 | 11.43 | 24500 | 3.4749 |
| 3.3986 | 11.67 | 25000 | 3.4578 |
| 3.372  | 11.9  | 25500 | 3.4405 |
| 3.3402 | 12.13 | 26000 | 3.4317 |
| 3.3281 | 12.37 | 26500 | 3.4215 |
| 3.322  | 12.6  | 27000 | 3.4093 |
| 3.3198 | 12.83 | 27500 | 3.4026 |
| 3.3039 | 13.07 | 28000 | 3.3971 |
| 3.296  | 13.3  | 28500 | 3.3954 |
| 3.3015 | 13.53 | 29000 | 3.3954 |
| 3.2939 | 13.77 | 29500 | 3.3927 |
| 3.3013 | 14.0  | 30000 | 3.3918 |
| 3.343  | 14.23 | 30500 | 3.4265 |
| 3.3438 | 14.47 | 31000 | 3.4133 |
| 3.3397 | 14.7  | 31500 | 3.3951 |
| 3.3156 | 14.93 | 32000 | 3.3681 |
| 3.2815 | 15.17 | 32500 | 3.3503 |
| 3.2654 | 15.4  | 33000 | 3.3313 |
| 3.2492 | 15.63 | 33500 | 3.3184 |
| 3.2399 | 15.87 | 34000 | 3.2995 |
| 3.2222 | 16.1  | 34500 | 3.2922 |
| 3.2026 | 16.33 | 35000 | 3.2818 |
| 3.191  | 16.57 | 35500 | 3.2723 |
| 3.1825 | 16.8  | 36000 | 3.2640 |
| 3.1691 | 17.03 | 36500 | 3.2530 |
| 3.1656 | 17.27 | 37000 | 3.2487 |
| 3.1487 | 17.5  | 37500 | 3.2419 |
| 3.1635 | 17.73 | 38000 | 3.2411 |
| 3.1675 | 17.97 | 38500 | 3.2330 |
| 3.1422 | 18.2  | 39000 | 3.2344 |
| 3.1443 | 18.43 | 39500 | 3.2331 |
| 3.1425 | 18.67 | 40000 | 3.2348 |
| 3.139  | 18.9  | 40500 | 3.2321 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.13.0
  • Tokenizers 0.13.3