Thanks @average-finland-92144
I managed to create a custom BPE tokenizer with a vocabulary of 72,344 tokens.
It was trained on the Sherlock Holmes novels.
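For anyone curious, this is roughly what that training step looks like if you use the Hugging Face `tokenizers` library (a minimal sketch, not my exact setup; the file names and special tokens are placeholders):

```python
from tokenizers import Tokenizer, models, trainers, pre_tokenizers

# Byte-level BPE tokenizer (sketch; paths/special tokens are placeholders)
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=72344,                 # target vocabulary size
    special_tokens=["[UNK]", "[PAD]"],
)

# Train on the plain-text corpus (one or more .txt files)
tokenizer.train(files=["sherlock_holmes.txt"], trainer=trainer)
tokenizer.save("sherlock_bpe.json")
```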
One observation: for the same text, the token count from my custom tokenizer is consistently lower on average than tiktoken's (for both the gpt-2 and o1 encodings).
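This is the kind of comparison I mean (a minimal sketch; the file paths are placeholders, and I'm using `o200k_base` as the tiktoken encoding for o1 / GPT-4o):

```python
import tiktoken
from tokenizers import Tokenizer

text = open("sherlock_holmes.txt", encoding="utf-8").read()

custom = Tokenizer.from_file("sherlock_bpe.json")        # custom BPE tokenizer
gpt2_enc = tiktoken.get_encoding("gpt2")                  # GPT-2 encoding
o200k_enc = tiktoken.get_encoding("o200k_base")           # encoding used by o1 / GPT-4o

print("custom :", len(custom.encode(text).ids))
print("gpt-2  :", len(gpt2_enc.encode(text)))
print("o200k  :", len(o200k_enc.encode(text)))
```

A lower count is expected here, since the custom vocabulary was fit to this exact corpus, so it merges domain-specific words that the general-purpose encodings split into multiple tokens.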
Not sure yet whether this will help once I start training my model.
Will keep you guys posted!!!