Commit Graph

  • d9262da635 update hf model: add rope config and add qkv x54-729 2023-12-12 12:19:20 +0800
  • d6eeacfeb2 bug lijiaxing 2023-12-12 10:36:04 +0800
  • cc5b15349d fix(metric): add metric dtype control (#533) Pryest 2023-12-11 19:36:31 +0800
  • fdce50a000 fix default behavior Pryest 2023-12-11 17:27:09 +0800
  • c7db6db066 fix the bug so that the sequence parallel norm is all-reduced when overlap is False yingtongxiong 2023-12-11 17:36:33 +0800
  • 347370a58a fix demo config to avoid implicity Pryest 2023-12-11 16:25:33 +0800
  • 6c0ff4820f feat(model): support llama model with checkpoint loading (#532) jiaxingli 2023-12-11 16:25:24 +0800
  • 649af64c59 fix(metric): add metric dtype control Pryest 2023-12-11 16:16:21 +0800
  • b63b8e58bd modeling lijiaxing 2023-12-11 15:43:59 +0800
  • a83b02acf4 modeling lijiaxing 2023-12-11 15:31:46 +0800
  • e57ca246d9 importerror lijiaxing 2023-12-11 13:53:48 +0800
  • 472671688f importerror lijiaxing 2023-12-11 13:38:37 +0800
  • 4b7fa26d80 support hf llama lijiaxing 2023-12-08 20:13:34 +0800
  • 41edd074a6 support hf llama lijiaxing 2023-12-08 16:43:56 +0800
  • 6def66fb07 support hf llama lijiaxing 2023-12-08 16:08:15 +0800
  • 9d824d66ec support hf llama lijiaxing 2023-12-08 12:43:16 +0800
  • 5c0925cd6c feat(metrics): make float32 logits off by default 877825076@qq.com 2023-12-08 00:46:53 +0800
  • 66e4a8a847 auto resume lijiaxing 2023-12-07 19:42:03 +0800
  • 81ffb3d824 fix(test): fix type_ids unpack bug (#530) Guoteng 2023-12-07 18:47:19 +0800
  • bbc1a01fe5 fix: update ci for type_ids unpack bug fix 877825076@qq.com 2023-12-07 13:17:02 +0800
  • 68159c22a4 auto resume lijiaxing 2023-12-07 10:24:52 +0800
  • 3f49409681 Merge branch 'develop' of https://github.com/InternLM/InternLM into storage_multipart_upload lijiaxing 2023-12-07 10:23:05 +0800
  • 1da080a58e auto resume lijiaxing 2023-12-07 10:19:48 +0800
  • 828033aed5 fix(storage): unify the name of ak & sk (#527) jiaxingli 2023-12-06 15:31:44 +0800
  • d0d39fa3ef storage lijiaxing 2023-12-06 15:24:57 +0800
  • ff62cf2a7c storage lijiaxing 2023-12-06 15:15:54 +0800
  • 809ad9ebc8 fix the type_ids when micro_num=1 and use_flash_attn=False (#516) ytxiong 2023-12-06 14:38:28 +0800
  • 112c34ae09 feat(grad_norm): vocab grad norm profiling (#519) jiaopenglong 2023-12-06 13:52:42 +0800
  • 9fc252f40e add output embedding tf32 option (#523) jiaopenglong 2023-12-06 13:50:59 +0800
  • c581cc4c02 fix(model): add IS_SEQUENCE_PARALLEL check for norm module (#528) ytxiong 2023-12-06 12:06:22 +0800
  • 16f8ec2354 fix the spell bug and move the sequence judge to training_internlm yingtongxiong 2023-12-06 12:03:23 +0800
  • bffb515d30 fix lint yingtongxiong 2023-12-06 11:03:10 +0800
  • a9d5ad1b5f replace the named_children by named_modules yingtongxiong 2023-12-06 11:01:07 +0800
  • 2b28923949 remove comments yingtongxiong 2023-12-06 10:35:40 +0800
  • e6c0d7bf62 fix lint yingtongxiong 2023-12-05 21:03:00 +0800
  • 62d193c763 add IS_SEQUENCE_PARALLEL check for norm module yingtongxiong 2023-12-05 20:58:26 +0800
  • a34c31c08e change ak sk name lijiaxing 2023-12-05 17:44:56 +0800
  • c3a636ba0c change ak sk name lijiaxing 2023-12-05 15:07:52 +0800
  • 3410362f4c change ak sk name lijiaxing 2023-12-05 13:57:05 +0800
  • 5c2c247e21 Merge branch 'storage_multipart_upload' of https://github.com/li126com/InternLM into storage_multipart_upload lijiaxing 2023-12-05 12:21:30 +0800
  • 4128f1dbe6 change ak sk name lijiaxing 2023-12-05 12:15:33 +0800
  • 72cb7d6869 fix ci test_pipeline JiaoPL 2023-12-05 12:10:41 +0800
  • 5b101f2377 Merge branch 'develop' of https://github.com/InternLM/InternLM into storage_multipart_upload lijiaxing 2023-12-05 11:56:04 +0800
  • 843653de05 Merge branch 'develop' into feat/vocab_grad_norm JiaoPL 2023-12-05 11:53:14 +0800
  • 2dbbab7418 fix test_checkpoint (#526) jiaxingli 2023-12-04 15:38:13 +0800
  • 3b322618a4 fix test_checkpoint lijiaxing 2023-12-04 15:08:57 +0800
  • 1738bee002 feat(storage): use multipart upload when using oss (#520) jiaxingli 2023-12-01 17:05:58 +0800
  • 66bffffe5c add unit test case (#524) kkscilife 2023-12-01 16:12:39 +0800
  • 66d6efd004 overlap gating further Wenwen Qu 2023-11-23 17:46:32 +0800
  • d74ad7cca7 change assert condition for tutel Wenwen Qu 2023-11-17 18:58:52 +0800
  • d20aa41d86 implement overlap moe forward Wenwen Qu 2023-11-16 19:43:47 +0800
  • 3443ab1f5b merge operand if noisy_gate_policy is not used Wenwen Qu 2023-11-28 16:17:49 +0800
  • 95263fa1d0 merge operands in topk gating Wenwen Qu 2023-11-28 14:52:50 +0800
  • 0b6a75c334 add unit test case wangmengke 2023-12-01 14:54:41 +0800
  • cdb8cfc929 Merge branch 'develop' into storage_multipart_upload jiaxingli 2023-12-01 14:17:37 +0800
  • c7e83fd611 storage lijiaxing 2023-12-01 14:14:55 +0800
  • 03e53871a7 storage lijiaxing 2023-12-01 14:08:45 +0800
  • b7f721fffb storage lijiaxing 2023-12-01 11:14:34 +0800
  • 3b7fb97e04 storage lijiaxing 2023-12-01 11:10:04 +0800
  • b3be333aa2 fix(ci): fix test model ckpt ci test (#518) Guoteng 2023-11-30 19:16:35 +0800
  • 4467f827d1 add output embedding tf32 option JiaoPL 2023-11-30 16:49:00 +0800
  • b79d5ea7ae test(workflow): add workflow for loss test and change trigger event (#513) kkscilife 2023-11-30 11:04:07 +0800
  • 7f7d9d9a2c assign rank ali li126com 2023-11-29 12:05:06 +0000
  • 90bf9adac4 assign rank ali li126com 2023-11-29 11:54:59 +0000
  • fb3006de1e assign rank ali li126com 2023-11-29 11:49:51 +0000
  • 06cdcc3654 upload lijiaxing 2023-11-29 11:08:40 +0800
  • 83ebebd5bc add grad_norm profiling interval && refactor save grad norm JiaoPL 2023-11-28 20:41:29 +0800
  • 757e19e01a 1. fix(config): rampup_batch_size defalut value BC. (#515) Guoteng 2023-11-28 19:33:46 +0800
  • 4a9c3c73ce optimize trigger event wangmengke 2023-11-28 17:21:29 +0800
  • 4e4fb52898 multipart upload lijiaxing 2023-11-28 15:37:26 +0800
  • 4eed07a3c3 compute vocab grad norm && save pt JiaoPL 2023-11-28 12:13:23 +0800
  • f37c8442f3 fix(ci): fix test model ckpt ci test 877825076@qq.com 2023-11-27 16:39:39 +0800
  • 9780c44917 fix comments gaoyang07 2023-11-25 23:34:18 +0800
  • fdbdfcff34 remove micro_bsz gaoyang07 2023-11-25 22:44:20 +0800
  • 06e8301861 name (#514) jiaxingli 2023-11-24 18:24:54 +0800
  • acf8fb9712 1. fix(config): rampup_batch_size defalut value BC. 2. fix(config): standardize config parameter access. 3. feat(launch): add warmup_process_group 4. feat(memory): add cuda_memory_analyze 877825076@qq.com 2023-11-24 16:45:18 +0800
  • 19157361e0 fix the type_ids when micro_num=1 and use_flash_attn=False yingtongxiong 2023-11-24 16:44:17 +0800
  • 05d0a8d821 name lijiaxing 2023-11-24 15:55:50 +0800
  • 6549ebebdf change trigger event wangmengke 2023-11-24 15:33:38 +0800
  • c5ea82b074 add workflow for loss test wangmengke 2023-11-24 15:22:19 +0800
  • b59641715a Feat(QA): Check loss when swapping micro_num and micro_bsz && Check grad norm (#510) jiaxingli 2023-11-24 12:05:14 +0800
  • 0d3811c029 feat(model): add rope_base interface (#512) Shuo Zhang 2023-11-23 16:30:14 +0800
  • b12dd9621f feat(model): add rope_base interface Shuo Zhang 2023-11-23 15:34:50 +0800
  • 4ed3388a2f check grad norm lijiaxing 2023-11-23 14:20:48 +0800
  • ed1d9c3b7c check grad norm lijiaxing 2023-11-23 14:16:48 +0800
  • 61346c24f6 check loss lijiaxing 2023-11-22 10:29:32 +0800
  • 35aa093afe Merge branch 'develop' of https://github.com/InternLM/InternLM into improve_unitest lijiaxing 2023-11-22 10:28:42 +0800
  • 7776693373 feat(doc): add GPU memory info for 7B & 20B models (#507) jiaxingli 2023-11-21 19:20:02 +0800
  • f5aea7e08c fix(timeout): larger timeout (#495) jiaopenglong 2023-11-21 19:19:22 +0800
  • 972b4f02c0 update timeout thresholds jiaopenglong 2023-11-21 17:40:13 +0800
  • 18a17a434c doc fix lijiaxing 2023-11-21 17:28:22 +0800
  • 4aa7c21a76 doc fix lijiaxing 2023-11-21 17:14:53 +0800
  • 8bd85a6f5e Merge branch 'develop' of https://github.com/InternLM/InternLM into improve_unitest lijiaxing 2023-11-17 16:56:16 +0800
  • f47ec9a34c memory_test lijiaxing 2023-11-17 16:53:17 +0800
  • eba2b859fc feat(seed): set global seed for every model initialization (#496) v0.2.1dev20231121 jiaxingli 2023-11-17 14:42:50 +0800
  • 679ed3c8ca test(workflow): add model init test (#504) kkscilife 2023-11-17 09:59:34 +0800
  • 0bfc86205e feat(train): support_rampup_batch_size and fix bugs (#493) Guoteng 2023-11-16 19:51:01 +0800
  • 4a6987d5e7 unitest_only_forward (#484) jiaxingli 2023-11-16 15:30:57 +0800
  • 569988ac57 reduce timeout wangmengke 2023-11-16 15:21:04 +0800
  • 81a02014d1 add model init test wangmengke 2023-11-16 15:15:20 +0800