# ml-and-mlops-questions
q
Hello all, I'm working on an NLP project for finding semantic textual similarity between two text paragraphs. So far, I have built a model that assesses the similarity on a scale of 0 to 1 between each pair of sentences from these two paragraphs. Not sure if this is the right channel to ask this question, but I'm just wondering if anyone has experience deploying the model to a cloud service provider by exposing it as a server API endpoint for predicting the similarity score?
q
@freezing-airport-6809 Thanks for providing the links. I will go through these links and check if they work in my use case.
p
@quaint-midnight-92440 let me know if you have any questions on using Union serverless for trying that!
q
@powerful-horse-58724 Sure
@powerful-horse-58724 Signing up on Union.ai requires access to my GitHub account. I'm having issues uploading the pre-trained model file, which is larger than 3 GB, to my repository on GitHub. Even with Git LFS, it still won't accept such a large file. How should I resolve this issue?
p
I can send you an example tomorrow, but I'd recommend looking at Hugging Face Hub to store models/data instead of GitHub. You can pull it down and store it as a Union artifact as well for easier access in your Union workflows, but currently in serverless there is a 30-day retention for data/artifacts.
q
@powerful-horse-58724 Thanks for your swift response. I will research Hugging Face Hub like you suggested and experiment with it in the meantime.
p
You can actually probably just upload to the Hub directly from the UI, which might be easier. But you can also upload via their SDK. This is a task I've used in Union, but you can run it locally or remove the Flyte/Union parts if you just want a local upload script. • Change the path to your local model path • `repo_name` should be `yourHFusername/reponameyouwant`
```python
import os

import torch
from flytekit import Resources, Secret, current_context, task


@task(
    container_image=container_image,
    requests=Resources(cpu="2", mem="2Gi"),
    secret_requests=[Secret(group=None, key="hf_token")],
)
def upload_model_to_hub(model: torch.nn.Module, repo_name: str) -> str:
    from huggingface_hub import HfApi

    # Get the Flyte context and define the model path
    ctx = current_context()
    model_path = "best_model.pth"  # Save the model locally as "best_model.pth"

    # Save the model's state dictionary
    torch.save(model.state_dict(), model_path)

    # Get the Hugging Face token from the local environment or Flyte secrets
    hf_token = os.getenv("HF_TOKEN")
    if hf_token is None:
        # If HF_TOKEN is not found, attempt to get it from the Flyte secrets
        hf_token = ctx.secrets.get(key="hf_token")
        print("Using Hugging Face token from Flyte secrets.")
    else:
        print("Using Hugging Face token from environment variable.")

    # Create the repository on Hugging Face Hub if it doesn't already exist
    api = HfApi()
    api.create_repo(repo_name, token=hf_token, exist_ok=True)

    # Upload the saved weights to the Hugging Face repository
    api.upload_file(
        path_or_fileobj=model_path,  # Path to the local file
        path_in_repo="pytorch_model.bin",  # Destination path in the repo
        repo_id=repo_name,
        commit_message="Upload model weights",
        token=hf_token,
    )

    return f"Model uploaded to Hugging Face Hub: https://huggingface.co/{repo_name}"
```
When you want to download it in a Union task, you can adjust this to your model repo / model type (whether you're using an LLM, a computer vision model, etc.):
```python
from pathlib import Path

from flytekit import Resources, current_context, task
from flytekit.types.directory import FlyteDirectory


@task(
    container_image=container_image,
    cache=True,
    cache_version="1",
    requests=Resources(cpu="2", mem="2Gi"),
)
def download_model(model_name: str) -> FlyteDirectory:
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    working_dir = Path(current_context().working_directory)
    saved_model_dir = working_dir / "saved_model"
    saved_model_dir.mkdir(parents=True, exist_ok=True)

    # Pull the model and tokenizer from the Hugging Face Hub
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        device_map="cpu",
        torch_dtype="auto",
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Persist both to a directory Flyte can hand off to downstream tasks
    model.save_pretrained(saved_model_dir)
    tokenizer.save_pretrained(saved_model_dir)

    return FlyteDirectory(saved_model_dir)
```
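Downstream of a download task like that, scoring a sentence pair often reduces to comparing embedding vectors. A minimal pure-Python cosine-similarity sketch (the vectors here stand in for real sentence embeddings produced by the model):

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    # Cosine similarity between two equal-length vectors, in [-1, 1].
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def to_unit_range(score: float) -> float:
    # Rescale a [-1, 1] cosine score to the 0-1 range used in this thread.
    return (score + 1.0) / 2.0
```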
q
Thanks for providing the code. I actually managed to deploy my app eventually, but will keep this in mind for future projects.