Welcome to DaisyRec-v2.0’s Documentation!
All parameters are described below.
Basic Settings
--problem_type
define a point-wise or pair-wise problem.
point-wise: point-wise algorithm
pair-wise: pair-wise algorithm
--optimization_metric
the metric to be optimized for hyper-parameter tuning via HyperOpt
ndcg
precision
recall
hr
map
mrr
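As a rough illustration of what some of these metrics measure, the sketch below computes them for one user from a ranked list and a set of relevant items. These are the common textbook definitions with binary relevance; the function names are hypothetical and DaisyRec's own implementation may differ in detail (map is omitted for brevity).

```python
import math

def hit_ratio(ranked, relevant, k):
    """1.0 if any of the top-k items is relevant, else 0.0."""
    return float(any(item in relevant for item in ranked[:k]))

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k items that are relevant."""
    return sum(item in relevant for item in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in ranked[:k]) / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant item in the top-k."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal list."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

Here `k` plays the role of the recommendation list length set by `--topk`.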
--hyperopt_trail
the number of HyperOpt trials
--hyperopt_pack
records the hyper-parameter search space for HyperOpt
--algo_name
the algorithm to be executed
mostpop
itemknn
puresvd
slim
mf
fm
neumf
nfm
ngcf
multi-vae
--dataset
the dataset to be evaluated
ml-100k
ml-1m
ml-10m
ml-20m
lastfm
book-x
amazon-cloth
amazon-electronic
amazon-book
amazon-music
epinions
yelp
citeulike
netflix
--prepro
the data pre-processing strategy
origin: adopt the raw data
Fcore: recursively keep only users and items with no fewer than F interactions, e.g., 5core
Ffilter: keep only users and items with no fewer than F interactions, applied once, e.g., 5filter
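The difference between the one-pass and the recursive strategy can be sketched as follows. This is a minimal illustration on (user, item) pairs; the function names are hypothetical, and it assumes users and items are counted and filtered in the same pass, which may not match DaisyRec's exact procedure.

```python
from collections import Counter

def n_filter(interactions, n):
    """One pass: keep pairs whose user and item both have >= n interactions."""
    user_cnt = Counter(u for u, _ in interactions)
    item_cnt = Counter(i for _, i in interactions)
    return [
        (u, i) for u, i in interactions
        if user_cnt[u] >= n and item_cnt[i] >= n
    ]

def n_core(interactions, n):
    """Repeat the one-pass filter until no more pairs are removed."""
    current = list(interactions)
    while True:
        filtered = n_filter(current, n)
        if len(filtered) == len(current):
            return filtered
        current = filtered
```

A single pass can leave behind users or items that dropped below the threshold because of that very pass; the recursive variant keeps filtering until every remaining user and item meets it.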
--val_method
training and validation data splitting strategy
tsbr: time-aware split-by-ratio
rsbr: random-aware split-by-ratio
tloo: time-aware leave-one-out
rloo: random-aware leave-one-out
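The four strategies combine two choices: how the held-out part is sized (a ratio versus one interaction per user) and how it is picked (by timestamp versus at random). A minimal sketch, assuming rows of (user, item, timestamp) and a global rather than per-user ratio split; the function names and the `seed` parameter are hypothetical and not part of DaisyRec:

```python
import random

def split_by_ratio(rows, test_size, time_aware, seed=0):
    """Hold out the last `test_size` fraction of rows (tsbr / rsbr style)."""
    if time_aware:
        ordered = sorted(rows, key=lambda r: r[2])  # oldest first
    else:
        ordered = list(rows)
        random.Random(seed).shuffle(ordered)
    cut = int(round(len(ordered) * (1 - test_size)))
    return ordered[:cut], ordered[cut:]

def leave_one_out(rows, time_aware, seed=0):
    """Hold out exactly one interaction per user (tloo / rloo style)."""
    rng = random.Random(seed)
    by_user = {}
    for row in rows:
        by_user.setdefault(row[0], []).append(row)
    train, test = [], []
    for user_rows in by_user.values():
        if time_aware:
            # The most recent interaction of this user is held out.
            idx = max(range(len(user_rows)), key=lambda j: user_rows[j][2])
        else:
            idx = rng.randrange(len(user_rows))
        test.append(user_rows[idx])
        train.extend(user_rows[:idx] + user_rows[idx + 1:])
    return train, test
```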
--test_method
training and test data splitting strategy, which should be consistent with the setting for val_method
--val_size
ratio of validation set size in the range of (0,1), e.g., 0.1 means retaining 10% of the training data as validation data
--test_size
ratio of test set size in the range of (0,1), e.g., 0.2 means retaining 20% of the whole data as test data
--topk
the length of the recommendation list
--fold_num
the number of cross-validation folds
--cand_num
the number of candidate items used for ranking
--sample_method
negative sampling strategy
uniform: uniformly sample negative items
low-pop: sample popular items with low rank
high-pop: sample popular items with high rank
--sample_ratio
the share of negative items drawn by popularity-based sampling under a hybrid sampling strategy, in the range of (0,1); e.g., for the hybrid sampling strategy uniform+low-pop, --sample_ratio=0.1 means 10% of the negative items are sampled via low-pop
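The arithmetic of a hybrid strategy can be sketched as below. This is an illustration only: the function name, the `popularity` mapping, and the `seed` parameter are hypothetical, and the popularity-weighted draw is a stand-in for low-pop or high-pop, whose exact definitions are not given here.

```python
import random

def hybrid_negative_sample(candidates, popularity, num_ng, sample_ratio, seed=0):
    """Draw num_ng negatives: a sample_ratio share by popularity, the rest uniformly."""
    rng = random.Random(seed)
    n_pop = int(round(num_ng * sample_ratio))   # popularity-based share
    n_uniform = num_ng - n_pop                  # remaining uniform share
    uniform_part = [rng.choice(candidates) for _ in range(n_uniform)]
    # Assumption: probability proportional to interaction count.
    weights = [popularity[item] for item in candidates]
    pop_part = rng.choices(candidates, weights=weights, k=n_pop) if n_pop else []
    return uniform_part, pop_part
```

With `num_ng=10` and `sample_ratio=0.1`, one negative comes from the popularity-based draw and nine from the uniform draw.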
--num_ng
the number of negative samples
--positive_threshold
the threshold for binarizing ratings into positive samples (for example, with a threshold of 4, items with ratings no less than 4 are treated as positive items)
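A minimal sketch of the thresholding rule, assuming rows of (user, item, rating). The function name is hypothetical, and what happens to the below-threshold rows (dropped, or kept as non-positives) is not specified here, so the sketch simply returns them separately.

```python
def split_by_threshold(ratings, positive_threshold):
    """Separate rows with rating >= threshold (positives) from the rest."""
    positives = [row for row in ratings if row[2] >= positive_threshold]
    others = [row for row in ratings if row[2] < positive_threshold]
    return positives, others
```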
--loss_type
type of loss function
CL: cross-entropy loss for point-wise problem
SL: squared error loss for point-wise problem
BPR: BPR loss for pair-wise problem
HL: hinge loss for pair-wise problem
TL: top-1 loss for pair-wise problem
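For orientation, the sketch below writes out common textbook forms of these five losses for a single sample (point-wise) or a single positive/negative pair (pair-wise). The function names and the `margin` default are assumptions, and DaisyRec's implementation may differ, for instance in reductions or added regularization terms.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cross_entropy_loss(score, label):
    """CL: binary cross-entropy on a raw score and a 0/1 label."""
    p = sigmoid(score)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def square_loss(score, label):
    """SL: squared difference between label and predicted score."""
    return (label - score) ** 2

def bpr_loss(pos_score, neg_score):
    """BPR: negative log-sigmoid of the positive-negative score gap."""
    return -math.log(sigmoid(pos_score - neg_score))

def hinge_loss(pos_score, neg_score, margin=1.0):
    """HL: zero once the positive beats the negative by the margin."""
    return max(0.0, margin - (pos_score - neg_score))

def top1_loss(pos_score, neg_score):
    """TL: ranking term plus a term pushing negative scores toward zero."""
    return sigmoid(neg_score - pos_score) + sigmoid(neg_score ** 2)
```

The point-wise losses take a label, while the pair-wise losses only compare a positive item's score with a sampled negative's, which is why `--loss_type` has to agree with `--problem_type`.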
--gpu
the ID of the GPU card
Algorithm Specific Settings
--init_method
parameter initializer
default: initialize parameters according to the original paper
normal: initialize parameters with a normal distribution
uniform: initialize parameters with a uniform distribution
xavier_normal: initialize parameters with Xavier normal initialization
xavier_uniform: initialize parameters with Xavier uniform initialization
--optimizer
optimization method for training the algorithms
default (optimizer in the original paper)
sgd
adam
adagrad
--early_stop
whether to activate the early-stop mechanism
true
false
--tune_testset
whether to tune hyper-parameters directly on the test set (default: false)
true
false
--factors
the dimension of latent factors (embeddings)
--reg_1
the coefficient of L1 regularization
--reg_2
the coefficient of L2 regularization
--dropout
dropout rate
--lr
learning rate
--epochs
training epochs
--batch_size
batch size for training
--num_layers
number of layers for MLP
--alpha
constant to multiply the penalty terms for SLIM
--elastic
the ElasticNet mixing parameter for SLIM in the range of (0,1)
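How the two SLIM parameters interact can be sketched with the ElasticNet penalty in the form scikit-learn uses, where one constant scales both terms and a mixing ratio divides the weight between L1 and L2. That DaisyRec's SLIM follows this exact convention is an assumption, and the function name is hypothetical.

```python
def elastic_net_penalty(weights, alpha, elastic):
    """alpha * elastic * ||w||_1 + 0.5 * alpha * (1 - elastic) * ||w||_2^2."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return alpha * elastic * l1 + 0.5 * alpha * (1.0 - elastic) * l2_squared
```

Under this convention a mixing value near 1 makes the penalty mostly L1 (sparser item-item weights), while a value near 0 makes it mostly L2.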
--pop_n
the number of most popular items preselected as candidates, used to reduce the time complexity of MostPop
--maxk
the number of neighbors to take into account for ItemKNN
--node_dropout
node dropout ratio for NGCF
--mess_dropout
message dropout ratio for NGCF
--kl_reg
the coefficient of KL regularization for Multi-VAE
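To show how a handful of these options fit together on the command line, here is a hypothetical `argparse` sketch. It is not DaisyRec's own parser: the flag names and choices come from the lists above, but every default value, and the helper `loss_matches_problem`, are assumptions made for illustration.

```python
import argparse

# Loss types grouped by the problem type they are documented for.
POINTWISE_LOSSES = {"CL", "SL"}
PAIRWISE_LOSSES = {"BPR", "HL", "TL"}

def build_parser():
    """Hypothetical parser covering a few of the documented flags."""
    parser = argparse.ArgumentParser(description="illustrative sketch only")
    parser.add_argument("--problem_type", choices=["point-wise", "pair-wise"],
                        default="point-wise")  # default is an assumption
    parser.add_argument("--algo_name", default="mf")
    parser.add_argument("--dataset", default="ml-100k")
    parser.add_argument("--prepro", default="origin")
    parser.add_argument("--val_method", choices=["tsbr", "rsbr", "tloo", "rloo"],
                        default="tsbr")
    parser.add_argument("--test_size", type=float, default=0.2)
    parser.add_argument("--topk", type=int, default=10)
    parser.add_argument("--loss_type", choices=["CL", "SL", "BPR", "HL", "TL"],
                        default="CL")
    return parser

def loss_matches_problem(args):
    """True if the chosen loss is listed for the chosen problem type."""
    if args.problem_type == "point-wise":
        return args.loss_type in POINTWISE_LOSSES
    return args.loss_type in PAIRWISE_LOSSES
```

For example, `build_parser().parse_args(["--problem_type", "pair-wise", "--loss_type", "BPR"])` yields a namespace for which `loss_matches_problem` returns `True`, whereas pairing `pair-wise` with `CL` would not.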