transformer_parameters.py

@antferdom
Last active September 15, 2023 09:28

Revisions

  1. antferdom revised this gist Sep 15, 2023. 1 changed file with 5 additions and 14 deletions.
    import torch
    from transformers import AutoModelForCausalLM

    # Load MPT-7B in bfloat16. trust_remote_code=True is required because the
    # MPT repository ships its own custom modeling code.
    model = AutoModelForCausalLM.from_pretrained(
        'mosaicml/mpt-7b',
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )
    model.eval()
    model.cuda()

    # Print the size of each parameter tensor (in millions of parameters) and
    # freeze every parameter whose name does not contain "31", i.e. keep only
    # the last of MPT-7B's 32 transformer blocks trainable. Note the substring
    # check is loose: it would also match any other name containing "31".
    for name, param in model.named_parameters():
        print(f"{name} size: {param.numel()/1000**2:.1f}M parameters")
        if "31" not in name:
            param.requires_grad = False
        print(name, param.requires_grad)

    # Total size in bytes and total count of the trainable (unfrozen) parameters.
    model_size = sum(
        p.numel() * p.element_size() for p in model.parameters() if p.requires_grad
    )
    model_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
  2. antferdom created this gist May 24, 2023.
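
The final revision leaves model_size in bytes (element_size() returns the per-element size in bytes) and model_params as a raw count. A minimal follow-up sketch, not part of the gist, that reports those totals in more readable units, assuming the variables computed above:

    # Hypothetical reporting of the totals computed in the gist above.
    # model_size is in bytes (numel * element_size); model_params is a raw count.
    print(f"Trainable parameters: {model_params / 1e9:.2f}B")
    print(f"Trainable size: {model_size / 1024**3:.2f} GiB")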