epic_kitchens package¶
epic-kitchens
This library contains a variety of useful classes and functions for performing common operations on the EPIC Kitchens dataset.
epic_kitchens.dataset¶
epic_kitchens.dataset.epic_dataset module¶
-
class
epic_kitchens.dataset.epic_dataset.EpicVideoDataset(gulp_path, class_type, *, with_metadata=False, class_getter=None, segment_filter=None, sample_transform=None)[source]¶ Bases:
epic_kitchens.dataset.video_dataset.VideoDataset
VideoDataset for gulped RGB frames
-
__init__(gulp_path, class_type, *, with_metadata=False, class_getter=None, segment_filter=None, sample_transform=None)[source]¶
- Parameters
gulp_path (Union[Path, str]) – Path to gulp directory containing the gulped EPIC RGB or flow frames.
class_type (str) – One of verb, noun, verb+noun, None; determines what label the segment returns. None should be used for loading test datasets.
with_metadata (bool) – When True the segments will yield a tuple (metadata, class) where the class is defined by the class getter and the metadata is the raw dictionary stored in the gulp file.
class_getter (Optional[Callable[[Dict[str, Any]], Any]]) – Optionally provide a callable that takes in the gulp dict representing the segment, from which you should return the class you wish the segment to have.
segment_filter (Optional[Callable[[VideoSegment], bool]]) – Optionally provide a callable that takes a segment and returns True if you want to keep the segment in the dataset, or False if you wish to exclude it.
sample_transform (Optional[Callable[[List[Image]], List[Image]]]) – Optionally provide a sample transform function which takes a list of PIL images and transforms each of them. This is applied on the frames just before returning from load_frames().
- Return type
None
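The class_getter and segment_filter callables described above can be sketched without the library installed. This is a minimal illustration, assuming the gulp metadata dict carries the verb_class and noun_class columns of the labels CSV; FakeSegment, verb_noun_getter and at_least_8_frames are hypothetical names, and FakeSegment only stands in for VideoSegment.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


def verb_noun_getter(gulp_meta: Dict[str, Any]) -> Tuple[int, int]:
    """Hypothetical class_getter: build a (verb_class, noun_class) label."""
    return int(gulp_meta["verb_class"]), int(gulp_meta["noun_class"])


@dataclass
class FakeSegment:
    """Stand-in for a VideoSegment; only id and num_frames are modelled."""
    id: str
    num_frames: int


def at_least_8_frames(segment: FakeSegment) -> bool:
    """Hypothetical segment_filter: keep segments with 8 or more frames."""
    return segment.num_frames >= 8


# Assumed metadata layout, mirroring a row of the training labels.
meta = {"uid": 6374, "verb_class": 3, "noun_class": 10}
label = verb_noun_getter(meta)

segments = [FakeSegment("6374", 214), FakeSegment("6375", 5)]
kept = [s.id for s in segments if at_least_8_frames(s)]
```

Callables of this shape would be passed as the class_getter and segment_filter keyword arguments.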
-
property
video_segments¶ List of video segments that are present in the dataset. They describe the start and stop times of the clip and its class.
- Return type
-
-
class
epic_kitchens.dataset.epic_dataset.EpicVideoFlowDataset(gulp_path, class_type, *, with_metadata=False, class_getter=None, segment_filter=None, sample_transform=None)[source]¶ Bases:
epic_kitchens.dataset.epic_dataset.EpicVideoDataset
VideoDataset for loading gulped flow. The loader assumes that flow \(u\), \(v\) frames are stored alternately in a flat manner: \([u_0, v_0, u_1, v_1, \ldots, u_n, v_n]\)
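The interleaved layout can be undone with two strided slices. The sketch below only illustrates the storage convention, not the loader's own code; destack_flow is a hypothetical helper and the string frames are placeholders for images.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def destack_flow(frames: Sequence[Frame]) -> List[Tuple[Frame, Frame]]:
    """Split a flat [u_0, v_0, u_1, v_1, ...] sequence into (u, v) pairs."""
    if len(frames) % 2 != 0:
        raise ValueError("expected an even number of frames (u/v pairs)")
    u_frames = frames[0::2]  # even positions hold the u component
    v_frames = frames[1::2]  # odd positions hold the v component
    return list(zip(u_frames, v_frames))


flat = ["u0", "v0", "u1", "v1", "u2", "v2"]
pairs = destack_flow(flat)
```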
-
class
epic_kitchens.dataset.epic_dataset.GulpVideoSegment(gulp_metadata_dict, class_getter)[source]¶ Bases:
epic_kitchens.dataset.video_dataset.VideoSegment
SegmentRecord for a video segment stored in a gulp file.
- Assumes that the video segment has the following metadata in the gulp file:
id
num_frames
epic_kitchens.dataset.video_dataset module¶
-
class
epic_kitchens.dataset.video_dataset.VideoDataset(class_count, segment_filter=None, sample_transform=None)[source]¶ Bases:
abc.ABC
A dataset interface for use with TsnDataset. Implement this interface if you wish to use your dataset with TSN.
We cannot use torch.utils.data.Dataset because we need to yield information about the number of frames per video, which we can’t do with the standard torch.utils.data.Dataset.
-
property
video_segments¶ - Return type
-
property
epic_kitchens.gulp¶
Dataset Adapters for GulpIO.
This module contains two adapters for ‘gulping’ both RGB and flow frames
which can then be used with the EpicVideoDataset classes.
epic_kitchens.gulp.adapter¶
-
class
epic_kitchens.gulp.adapter.EpicDatasetAdapter(video_segment_dir, annotations_df, frame_size=-1, extension='jpg', labelled=True)[source]¶ Bases:
gulpio.adapters.AbstractDatasetAdapter
Gulp Dataset Adapter for Gulping RGB frames extracted from the EPIC-KITCHENS dataset
-
__init__(video_segment_dir, annotations_df, frame_size=-1, extension='jpg', labelled=True)[source]¶ Gulp all action segments in annotations_df reading the dumped frames from video_segment_dir
- Parameters
video_segment_dir (str) – Root directory containing segmented frames:
frame-segments/
├── P01
│   ├── P01_01
│   |   ├── P01_01_0_open-door
│   |   |   ├── frame_0000000008.jpg
│   |   |   ...
│   |   |   ├── frame_0000000202.jpg
│   |   ...
│   |   ├── P01_01_329_put-down-plate
│   |   |   ├── frame_0000098424.jpg
│   |   |   ...
│   |   |   ├── frame_0000098501.jpg
│   ...
annotations_df (DataFrame) – DataFrame containing labels to be gulped.
frame_size (int) – Size of shortest edge of the frame, if not already this size then it will be resized.
extension (str) – Extension of dumped frames.
- Return type
None
-
iter_data(slice_element=None)[source]¶ Get frames and metadata corresponding to segment
- Parameters
slice_element (optional) – If not specified all frames for the segment will be returned
- Yields
dict – dictionary with the fields
meta: All metadata corresponding to the segment, this is the same as the data in the labels csv
frames: list of PIL.Image.Image corresponding to the frames specified in slice_element
id: UID corresponding to segment
- Return type
-
-
class
epic_kitchens.gulp.adapter.EpicFlowDatasetAdapter(video_segment_dir, annotations_df, frame_size=-1, extension='jpg', labelled=True)[source]¶ Bases:
epic_kitchens.gulp.adapter.EpicDatasetAdapter
Gulp Dataset Adapter for Gulping flow frames extracted from the EPIC-KITCHENS dataset
-
iter_data(slice_element=None)[source]¶ Get frames and metadata corresponding to segment
- Parameters
slice_element (optional) – If not specified all frames for the segment will be returned
- Yields
dict – dictionary with the fields
meta: All metadata corresponding to the segment, this is the same as the data in the labels csv
frames: list of PIL.Image.Image corresponding to the frames specified in slice_element
id: UID corresponding to segment
-
epic_kitchens.gulp.visualisation¶
-
class
epic_kitchens.gulp.visualisation.FlowVisualiser(dataset)[source]¶ Bases:
epic_kitchens.gulp.visualisation.Visualiser
Visualiser for video dataset containing optical flow \((u, v)\) frames
-
class
epic_kitchens.gulp.visualisation.RgbVisualiser(dataset)[source]¶ Bases:
epic_kitchens.gulp.visualisation.Visualiser
Visualiser for video dataset containing RGB frames
-
epic_kitchens.gulp.visualisation.clipify_flow(frames, *, fps=30.0)[source]¶ Destack flow frames, join them side by side and then create a clip for display
epic_kitchens.meta¶
-
class
epic_kitchens.meta.Action(verb, noun)¶ Bases:
tuple-
property
noun¶ Alias for field number 1
-
property
verb¶ Alias for field number 0
-
property
-
class
epic_kitchens.meta.ActionClass(verb_class, noun_class)¶ Bases:
tuple-
property
noun_class¶ Alias for field number 1
-
property
verb_class¶ Alias for field number 0
-
property
-
epic_kitchens.meta.action_id_from_verb_noun(verb, noun)[source]¶ Map a verb and noun id to a dense action id.
Examples
>>> action_id_from_verb_noun(0, 0)
0
>>> action_id_from_verb_noun(0, 1)
1
>>> action_id_from_verb_noun(0, 351)
351
>>> action_id_from_verb_noun(1, 0)
352
>>> action_id_from_verb_noun(1, 1)
353
>>> action_id_from_verb_noun(np.array([0, 1, 2]), np.array([0, 1, 2]))
array([  0, 353, 706])
-
epic_kitchens.meta.action_tuples_to_ids(action_classes)[source]¶ Convert a list of action classes composed of a verb and noun class to a dense action id using the formula: \(c_v * 352 + c_n\)
- Parameters
action_classes (Iterable[ActionClass]) –
- Return type
- Returns
action_ids
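The dense id formula and its inverse can be written in a few lines of plain Python. This is a sketch of the arithmetic only, not the library's implementation (which also accepts NumPy arrays); the function names and the NOUN_CLASS_COUNT constant are illustrative, with 352 taken from the formula above.

```python
from typing import Iterable, List, Tuple

NOUN_CLASS_COUNT = 352  # multiplier taken from the documented formula


def action_id(verb_class: int, noun_class: int) -> int:
    """Encode (verb_class, noun_class) as c_v * 352 + c_n."""
    return verb_class * NOUN_CLASS_COUNT + noun_class


def decode_action_id(action: int) -> Tuple[int, int]:
    """Invert the encoding: the quotient is the verb, the remainder the noun."""
    return divmod(action, NOUN_CLASS_COUNT)


def action_ids(action_classes: Iterable[Tuple[int, int]]) -> List[int]:
    """Encode every (verb_class, noun_class) pair in order."""
    return [action_id(v, n) for v, n in action_classes]


ids = action_ids([(0, 0), (0, 351), (1, 0), (2, 2)])
verb, noun = decode_action_id(353)
```

Because every noun class is below 352, the quotient and remainder recover the verb and noun ids unambiguously.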
-
epic_kitchens.meta.class_to_noun(cls)[source]¶
- Parameters
cls (int) – numeric noun class
- Return type
- Returns
Canonical noun representing the class
- Raises
IndexError – if cls is an invalid noun class
-
epic_kitchens.meta.class_to_verb(cls)[source]¶
- Parameters
cls (int) – numeric verb class
- Return type
- Returns
Canonical verb representing the class
- Raises
IndexError – if cls is an invalid verb class
-
epic_kitchens.meta.get_datadir()[source]¶
- Return type
- Returns
Directory under which any downloaded files are stored, defaults to current working directory
-
epic_kitchens.meta.is_many_shot_action(action_class)[source]¶
- Parameters
action_class (ActionClass) – (verb_class, noun_class) tuple
- Return type
- Returns
Whether action_class is many shot or not
-
epic_kitchens.meta.many_shot_actions()[source]¶
- Return type
- Returns
The set of action classes that are many shot (verb_class appears more than 100 times in training, noun_class appears more than 100 times in training, and the action appears at least once in training).
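The many-shot rule can be made concrete with a small counting sketch. It applies the three stated conditions to a toy list of training (verb_class, noun_class) pairs; many_shot_action_set and its threshold parameter are hypothetical and do not reproduce how the library loads its own set.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def many_shot_action_set(
    training_actions: Iterable[Tuple[int, int]], threshold: int = 100
) -> Set[Tuple[int, int]]:
    """Actions seen in training whose verb and noun each occur > threshold times."""
    actions = list(training_actions)
    verb_counts = Counter(v for v, _ in actions)
    noun_counts = Counter(n for _, n in actions)
    return {
        (v, n)
        for v, n in set(actions)  # "appears at least once in training"
        if verb_counts[v] > threshold and noun_counts[n] > threshold
    }


# Toy training set: verb 3 occurs 103 times, noun 10 occurs 102 times.
train = [(3, 10)] * 101 + [(3, 7)] * 2 + [(5, 10)] * 1
result = many_shot_action_set(train)
```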
-
epic_kitchens.meta.noun_classes()[source]¶ Get dataframe containing the mapping between numeric noun classes, the canonical noun of that class and nouns clustered into the class.
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
noun_id | int | 2 | ID of the noun class.
class_key | string | pan:dust | Key of the noun class.
nouns | list of string (1 or more) | "['pan:dust', 'dustpan']" | All nouns within the class (includes the key).
-
epic_kitchens.meta.noun_id_from_action_id(action)[source]¶ Decode action id to noun id.
Examples
>>> noun_id_from_action_id(0)
0
>>> noun_id_from_action_id(1)
1
>>> noun_id_from_action_id(351)
351
>>> noun_id_from_action_id(352)
0
>>> noun_id_from_action_id(353)
1
>>> noun_id_from_action_id(352 + 351)
351
>>> noun_id_from_action_id(np.array([0, 1, 353]))
array([0, 1, 1])
-
epic_kitchens.meta.noun_to_class(noun)[source]¶
- Parameters
noun (str) – A noun from a narration
- Return type
- Returns
The corresponding numeric class of the noun if it exists
- Raises
IndexError – If the noun doesn’t belong to any of the noun classes
-
epic_kitchens.meta.test_timestamps(split)[source]¶
- Parameters
split (str) – ‘seen’, ‘unseen’, or ‘all’ (loads both with a ‘split’
- Return type
DataFrame
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
uid | int | 1924 | Unique ID of the segment.
participant_id | string | P01 | ID of the participant.
video_id | string | P01_11 | Video the segment is in.
start_timestamp | string | 00:00:00.000 | Start time in HH:mm:ss.SSS of the action.
stop_timestamp | string | 00:00:01.890 | End time in HH:mm:ss.SSS of the action.
start_frame | int | 1 | Start frame of the action (WARNING only for frames extracted as detailed in annotations README).
stop_frame | int | 93 | End frame of the action (WARNING only for frames extracted as detailed in annotations README).
-
epic_kitchens.meta.training_labels()[source]¶
- Return type
DataFrame
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
uid | int | 6374 | Unique ID of the segment.
video_id | string | P03_01 | Video the segment is in.
narration | string | close fridge | English description of the action provided by the participant.
start_timestamp | string | 00:23:43.847 | Start time in HH:mm:ss.SSS of the action.
stop_timestamp | string | 00:23:47.212 | End time in HH:mm:ss.SSS of the action.
start_frame | int | 85430 | Start frame of the action (WARNING only for frames extracted as detailed in annotations README).
stop_frame | int | 85643 | End frame of the action (WARNING only for frames extracted as detailed in annotations README).
participant_id | string | P03 | ID of the participant.
verb | string | close | Parsed verb from the narration.
noun | string | fridge | First parsed noun from the narration.
verb_class | int | 3 | Numeric ID of the parsed verb’s class.
noun_class | int | 10 | Numeric ID of the parsed noun’s class.
all_nouns | list of string (1 or more) | ['fridge'] | List of all parsed nouns from the narration.
all_nouns_class | list of int (1 or more) | [10] | List of numeric IDs corresponding to all of the parsed nouns’ classes from the narration.
-
epic_kitchens.meta.training_narrations()[source]¶
- Return type
DataFrame
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
participant_id | string | P03 | ID of the participant.
video_id | string | P03_01 | Video the segment is in.
start_timestamp | string | 00:23:43.847 | Start time in HH:mm:ss.SSS of the narration.
stop_timestamp | string | 00:23:47.212 | End time in HH:mm:ss.SSS of the narration.
narration | string | close fridge | English description of the action provided by the participant.
-
epic_kitchens.meta.training_object_labels()[source]¶
- Return type
DataFrame
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
noun_class | int | 20 | Integer value representing the class in noun-classes.csv.
noun | string | bag | Original string name for the object.
participant_id | string | P01 | ID of participant.
video_id | string | P01_01 | Video the object was annotated in.
frame | int | 056581 | Frame number of the annotated object.
bounding_boxes | list of 4-tuple (0 or more) | "[(76, 1260, 462, 186)]" | Annotated boxes with format (<top:int>,<left:int>,<height:int>,<width:int>).
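If the bounding_boxes cell arrives as a string, as the quoted example suggests, it can be parsed safely with ast.literal_eval. This sketch assumes that string form and the (top, left, height, width) order given above; Box and parse_boxes are hypothetical helpers, not part of the library.

```python
import ast
from typing import List, NamedTuple


class Box(NamedTuple):
    """One annotated box in the documented (top, left, height, width) order."""
    top: int
    left: int
    height: int
    width: int


def parse_boxes(cell: str) -> List[Box]:
    """Parse a cell such as "[(76, 1260, 462, 186)]" into Box tuples."""
    return [Box(*coords) for coords in ast.literal_eval(cell)]


boxes = parse_boxes("[(76, 1260, 462, 186)]")
empty = parse_boxes("[]")  # frames may carry zero boxes

# Derive the opposite corner from the documented fields.
bottom = boxes[0].top + boxes[0].height
right = boxes[0].left + boxes[0].width
```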
-
epic_kitchens.meta.verb_classes()[source]¶ Get dataframe containing the mapping between numeric verb classes, the canonical verb of that class and verbs clustered into the class.
- Return type
DataFrame
- Returns
Dataframe with the columns
Column Name | Type | Example | Description
verb_id | int | 3 | ID of the verb class.
class_key | string | close | Key of the verb class.
verbs | list of string (1 or more) | "['close', 'close-off', 'shut']" | All verbs within the class (includes the key).
-
epic_kitchens.meta.verb_id_from_action_id(action_id)[source]¶ Decode action id to verb id.
- Parameters
action_id (Union[int, ndarray]) – Either a single action id, or a np.ndarray of action ids.
Examples
>>> verb_id_from_action_id(0)
0
>>> verb_id_from_action_id(1)
0
>>> verb_id_from_action_id(352)
1
>>> verb_id_from_action_id(353)
1
>>> verb_id_from_action_id(np.array([0, 352, 1, 353]))
array([0, 1, 0, 1])
-
epic_kitchens.meta.verb_to_class(verb)[source]¶
- Parameters
verb (str) – A verb from a narration
- Return type
- Returns
The corresponding numeric class of the verb if it exists
- Raises
IndexError – If the verb doesn’t belong to any of the verb classes
-
epic_kitchens.meta.video_descriptions()[source]¶
- Return type
DataFrame
- Returns
High level description of the task trying to be accomplished in a video.
Column Name | Type | Example | Description
video_id | string | P01_01 | ID of the video.
date | string | 30/04/2017 | Date on which the video was shot.
time | string | 13:49:00 | Local recording time of the video.
description | string | prepared breakfast with soy milk and cereals | Description of the activities contained in the video.
-
epic_kitchens.meta.video_info()[source]¶
- Return type
DataFrame
- Returns
Technical information stating the resolution, duration and FPS of each video.
Column Name | Type | Example | Description
video | string | P01_01 | Video ID
resolution | string | 1920x1080 | Resolution of the video, format is WIDTHxHEIGHT
duration | float | 1652.152817 | Duration of the video, in seconds
fps | float | 59.9400599400599 | Frame rate of the video
epic_kitchens.metrics¶
-
epic_kitchens.metrics.compute_class_agnostic_metrics(groundtruth_df, ranks, many_shot_verbs=None, many_shot_nouns=None, many_shot_actions=None)[source]¶ Compute class agnostic metrics (many-shot precision and recall) from ranks.
- Parameters
groundtruth_df (DataFrame) – DataFrame containing 'verb_class': int, 'noun_class': int and 'action_class': int columns.
ranks (Dict[str, ndarray]) – Dictionary containing three entries: 'verb', 'noun' and 'action'. Entries should map to a 2D np.ndarray of shape (n_instances, n_classes) where the index is the predicted rank of the class at that index.
many_shot_verbs (Optional[ndarray]) – The set of verb classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_verbs()
many_shot_nouns (Optional[ndarray]) – The set of noun classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_nouns()
many_shot_actions (Optional[ndarray]) – The set of action classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_actions()
- Return type
- Returns
Dictionary with the structure:
precision:
    verb: float
    noun: float
    action: float
    verb_per_class: dict[str: float, length = n_verbs]
recall:
    verb: float
    noun: float
    action: float
    verb_per_class: dict[str: float, length = n_verbs]
The 'verb', 'noun', and 'action' entries of the metric dictionaries are the macro-averaged mean precision/recall over the set of many shot classes, whereas the 'verb_per_class' entry is a breakdown for each verb_class in the format of a dictionary mapping stringified verb class to that class’ precision/recall.
-
epic_kitchens.metrics.compute_class_aware_metrics(groundtruth_df, ranks, top_k=(1, 5))[source]¶ Compute class aware metrics (accuracy @ 1/5) from ranks.
- Parameters
groundtruth_df (DataFrame) – DataFrame containing 'verb_class': int, 'noun_class': int and 'action_class': int columns.
ranks (Dict[str, ndarray]) – Dictionary containing three entries: 'verb', 'noun' and 'action'. Entries should map to a 2D np.ndarray of shape (n_instances, n_classes) where the index is the predicted rank of the class at that index.
top_k (Union[int, Tuple[int, …]]) – The set of k values to compute top-k accuracy for.
- Return type
- Returns
Dictionary with the structure:
verb: list[float, length = len(top_k)]
noun: list[float, length = len(top_k)]
action: list[float, length = len(top_k)]
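Top-k accuracy from ranks amounts to checking whether the true class is among the first k ranked classes. The pure-Python sketch below assumes each row lists classes from best to worst (the convention described for scores_to_ranks); topk_accuracy_from_ranks is a hypothetical name, not the library function.

```python
from typing import List, Sequence


def topk_accuracy_from_ranks(
    ranks: Sequence[Sequence[int]],
    labels: Sequence[int],
    ks: Sequence[int] = (1, 5),
) -> List[float]:
    """Fraction of instances whose label is among the first k ranked classes."""
    if len(ranks) != len(labels):
        raise ValueError("ranks and labels must have the same length")
    if not labels:
        raise ValueError("at least one instance is required")
    accuracies = []
    for k in ks:
        hits = sum(1 for row, label in zip(ranks, labels) if label in row[:k])
        accuracies.append(hits / len(labels))
    return accuracies


# Four instances, three classes; each row is ordered best to worst.
ranks = [[2, 0, 1], [1, 2, 0], [0, 1, 2], [2, 1, 0]]
labels = [2, 2, 1, 0]
acc = topk_accuracy_from_ranks(ranks, labels, ks=(1, 2))
```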
-
epic_kitchens.metrics.compute_metrics(groundtruth_df, scores, many_shot_verbs=None, many_shot_nouns=None, many_shot_actions=None, action_priors=None)[source]¶ Compute the EPIC action recognition evaluation metrics from scores given ground truth labels in groundtruth_df.
- Parameters
groundtruth_df (DataFrame) – DataFrame containing verb_class: int, noun_class: int. This function will add an action_class column containing the action ID obtained from epic_kitchens.meta.action_id_from_verb_noun().
scores (Dict[str, Union[ndarray, Dict[int, float]]]) – Dictionary containing: 'verb', 'noun' and (optionally) 'action' entries. 'verb' and 'noun' should map to a 2D np.ndarray of shape (n_instances, n_classes) where each element is the predicted score of that class. 'action' should map to a dictionary of action keys to scores. The order of the scores array should be the same as the order in groundtruth_df.
many_shot_verbs (Optional[ndarray]) – The set of verb classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_verbs()
many_shot_nouns (Optional[ndarray]) – The set of noun classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_nouns()
many_shot_actions (Optional[ndarray]) – The set of action classes that are considered many shot. If not provided they are loaded from epic_kitchens.meta.many_shot_actions()
action_priors (Optional[ndarray]) – A (n_verbs, n_nouns) shaped array containing the action prior used to weight action predictions.
- Return type
- Returns
A dictionary containing all metrics with the following structure:
accuracy:
    verb: list[float, length 2]
    noun: list[float, length 2]
    action: list[float, length 2]
precision:
    verb: float
    noun: float
    action: float
recall:
    verb: float
    noun: float
    action: float
Accuracy lists contain the top-k metrics like so [top_1, top_5], the precision and recall metrics are macro averaged and computed over the many-shot classes.
- Raises
ValueError – If the shapes of the scores arrays are not correct, or the lengths of groundtruth_df and the scores arrays are not equal, or if groundtruth_df doesn’t have the specified columns.
-
epic_kitchens.metrics.precision_recall(rankings, labels, classes=None)[source]¶ Computes precision and recall from rankings.
- Parameters
- Return type
- Returns
Tuple of (precision, recall) where precision is a 1D array of shape (len(classes),), and recall is a 1D array of shape (len(classes),)
- Raises
ValueError – If the dimensionality of the rankings or labels is incorrect, or if the length of the rankings and labels are not equal, or if the set of the provided classes is not a subset of the classes present in labels.
-
epic_kitchens.metrics.topk_accuracy(rankings, labels, ks=(1, 5))[source]¶ Computes top-k accuracies for different values of k from rankings.
- Parameters
- Return type
- Returns
Top-k accuracy for each k in ks. If only one k is provided, then only a single float is returned.
- Raises
ValueError – If the dimensionality of the rankings or labels is incorrect, or if the length of rankings and labels aren’t equal.
epic_kitchens.scoring¶
-
epic_kitchens.scoring.compute_action_scores(verb_scores, noun_scores, top_k=100, action_priors=None)[source]¶ Given the predicted verb and noun scores, compute action scores by \(p(A = (v, n)) = p(V = v)p(N = n)\).
- Parameters
verb_scores (ndarray) – 2D array of verb scores (n_instances, n_verbs).
noun_scores (ndarray) – 2D array of noun scores (n_instances, n_nouns).
top_k (int) – Number of highest scored actions to compute.
action_priors (Optional[ndarray]) – 2D array of action priors (n_verbs, n_nouns). These don’t have to sum to one and as such you can provide the training counts of \((v, n)\) occurrences (to minimize numerical stability issues).
- Return type
- Returns
A tuple ((verbs, nouns), action_scores) where verbs and nouns are 2D arrays of shape (n_instances, top_k) containing the classes constituting the top-k action scores. action_scores is a 2D array of shape (n_instances, top_k) where action_scores[i, j] corresponds to the score for the action class (verbs[i, j], nouns[i, j]). The scores are sorted in descending order, i.e. action_scores[i, j] >= action_scores[i, j + 1].
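For a single instance, the product rule \(p(A = (v, n)) = p(V = v)p(N = n)\) can be sketched in plain Python. This is an illustration under assumptions, not the library's vectorised code: top_action_scores is a hypothetical name, and applying the prior as a multiplicative weight is an assumption made here.

```python
from typing import List, Optional, Sequence, Tuple

ScoredAction = Tuple[Tuple[int, int], float]


def top_action_scores(
    verb_scores: Sequence[float],
    noun_scores: Sequence[float],
    top_k: int = 100,
    action_priors: Optional[Sequence[Sequence[float]]] = None,
) -> List[ScoredAction]:
    """Score every (verb, noun) pair for one instance and keep the best top_k."""
    scored: List[ScoredAction] = []
    for v, p_v in enumerate(verb_scores):
        for n, p_n in enumerate(noun_scores):
            # Assumption: the prior is a multiplicative weight on the product.
            prior = 1.0 if action_priors is None else action_priors[v][n]
            scored.append(((v, n), p_v * p_n * prior))
    # Python's sort is stable, so ties keep their (verb, noun) enumeration order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


top = top_action_scores([0.5, 0.25, 0.25], [0.75, 0.25], top_k=2)
```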
-
epic_kitchens.scoring.scores_dict_to_ranks(scores_dict)[source]¶ Convert a dictionary of task to scores to a dictionary of task to ranks
-
epic_kitchens.scoring.scores_to_ranks(scores)[source]¶ Convert scores to ranks
- Parameters
scores (Union[ndarray, List[Dict[int, float]]]) – A 2D array of scores of shape (n_instances, n_classes) or a list of dictionaries, where each dictionary represents the sparse scores for a task. The key: value pairs of the dictionary represent the class: score mapping.
- Return type
- Returns
A 2D array of ranks (n_instances, n_classes). Each row contains the ranked classes in descending order, i.e. ranks[0, i] is ranked higher than ranks[0, i+1]. The index is the rank, and the element the class at that rank.
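The dense case of this conversion is a per-row argsort in descending order. The sketch below uses nested lists instead of NumPy arrays and leaves out the sparse dictionary input; scores_to_ranks_sketch is a hypothetical name.

```python
from typing import List, Sequence


def scores_to_ranks_sketch(scores: Sequence[Sequence[float]]) -> List[List[int]]:
    """For each row, list class indices from highest score to lowest."""
    return [
        sorted(range(len(row)), key=lambda cls: row[cls], reverse=True)
        for row in scores
    ]


ranks = scores_to_ranks_sketch([[0.2, 0.6, 0.1], [0.9, 0.05, 0.05]])
```

Here ranks[0][0] is the best-scoring class of the first instance, matching the "index is the rank, element is the class" convention.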
-
epic_kitchens.scoring.softmax(x)[source]¶ Compute the softmax of the 1D or 2D array x.
- Parameters
x (ndarray) – a 1D or 2D array. If 1D, then it is assumed that it is a single class score vector. Otherwise, if x is 2D, then each row is assumed to be a class score vector.
Examples
>>> res = softmax(np.array([0, 200, 10]))
>>> np.sum(res)
1.0
>>> np.all(np.abs(res - np.array([0, 1, 0])) < 0.0001)
True
>>> res = softmax(np.array([[0, 200, 10], [0, 10, 200], [200, 0, 10]]))
>>> np.argsort(res, axis=1)
array([[0, 2, 1],
       [0, 1, 2],
       [1, 2, 0]])
>>> np.sum(res, axis=1)
array([1., 1., 1.])
>>> res = softmax(np.array([[0, 200, 10], [0, 10, 200]]))
>>> np.sum(res, axis=1)
array([1., 1.])
- Return type
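Softmax on large scores overflows if the exponentials are taken directly. A common remedy is to subtract the row maximum first, sketched below for a single 1D score vector. Whether the library uses this exact trick is not stated here, and stable_softmax is a hypothetical name.

```python
import math
from typing import List, Sequence


def stable_softmax(x: Sequence[float]) -> List[float]:
    """Softmax of one score vector, shifted by its maximum to avoid overflow."""
    shift = max(x)
    exps = [math.exp(value - shift) for value in x]  # every exponent is <= 0
    total = sum(exps)
    return [e / total for e in exps]


res = stable_softmax([0.0, 1000.0, 10.0])  # math.exp(1000.0) alone would overflow
```

Shifting by the maximum leaves the result unchanged because the common factor cancels between numerator and denominator.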
-
epic_kitchens.scoring.top_scores(scores, top_k=100)[source]¶ Return the top_k class indices and scores in descending order.
- Parameters
- Return type
- Returns
A tuple containing two arrays, (ranked_classes, scores) where ranked_classes contains the classes in descending order of score, and scores contains the corresponding score for each class, i.e. ranked_classes[..., i] has score scores[..., i].
Examples
>>> top_scores(np.array([0.2, 0.6, 0.1, 0.04, 0.06]), top_k=3)
(array([1, 0, 2]), array([0.6, 0.2, 0.1]))
epic_kitchens.preprocessing¶
Pre-processing tools to munge data into a format suitable for training
epic_kitchens.preprocessing.split_segments¶
Program for splitting frames into action segments. See Action segmentation for usage details.
epic_kitchens.labels¶
Column names present in a labels dataframe.
Rather than accessing column names directly, we suggest you import these constants and use them to access the data in case the names change at any point.
-
epic_kitchens.labels.NARRATION_COL = 'narration'¶ Narration column name, the original narration by the participant about the action performed
e.g.
"close fridge"
-
epic_kitchens.labels.NOUNS_CLASS_COL = 'all_noun_classes'¶ Nouns class column name, the classes corresponding to each noun extracted from the narration
e.g.
[10]
There is also a noun class corresponding to an action without a noun; consider the narration “stir”, where no object is specified.
-
epic_kitchens.labels.NOUNS_COL = 'all_nouns'¶ Nouns column name, all nouns extracted from the narration
e.g.
["fridge"]
-
epic_kitchens.labels.NOUN_CLASS_COL = 'noun_class'¶ Noun class column name, the class corresponding to the first noun extracted from the narration
e.g.
10
-
epic_kitchens.labels.NOUN_COL = 'noun'¶ Noun column name, the first noun extracted from the narration
e.g.
"fridge"
-
epic_kitchens.labels.PARTICIPANT_ID_COL = 'participant_id'¶ Participant ID column name, the identifier corresponding to an individual
e.g.
"P03"
-
epic_kitchens.labels.START_F_COL = 'start_frame'¶ Start frame column name, the frame corresponding to the starting timestamp
e.g.
85430
-
epic_kitchens.labels.START_TS_COL = 'start_timestamp'¶ Start timestamp column name, the timestamp of the start of the action segment
e.g.
"00:23:43.847"
-
epic_kitchens.labels.STOP_F_COL = 'stop_frame'¶ Stop frame column name, the frame corresponding to the stopping timestamp
e.g.
85643
-
epic_kitchens.labels.STOP_TS_COL = 'stop_timestamp'¶ Stop timestamp column name, the timestamp of the end of the action segment
e.g.
"00:23:47.212"
-
epic_kitchens.labels.UID_COL = 'uid'¶ UID column name, the unique ID of the segment
e.g.
6374
-
epic_kitchens.labels.VERB_CLASS_COL = 'verb_class'¶ Verb class column name, the class corresponding to the verb extracted from the narration.
e.g.
3
-
epic_kitchens.labels.VERB_COL = 'verb'¶ Verb column name, the first verb extracted from the narration
e.g.
"close"
-
epic_kitchens.labels.VIDEO_ID_COL = 'video_id'¶ Video column name, an identifier for a specific video of the form Pdd_dd, the first two digits are the participant ID, and the last two digits the video ID
e.g.
"P03_01"
epic_kitchens.time¶
Functions for converting between frames and timestamps
-
epic_kitchens.time.flow_frame_count(rgb_frame, stride, dilation)[source]¶ Get the number of frames in an optical flow segment given the number of frames in the corresponding RGB segment from which the flow was extracted with parameters (stride, dilation)
- Parameters
- Return type
- Returns
The number of optical flow frames
Examples
>>> flow_frame_count(6, 1, 1)
5
>>> flow_frame_count(6, 2, 1)
3
>>> flow_frame_count(6, 1, 2)
4
>>> flow_frame_count(6, 2, 2)
2
>>> flow_frame_count(6, 3, 1)
2
>>> flow_frame_count(6, 1, 3)
3
>>> flow_frame_count(7, 1, 1)
6
>>> flow_frame_count(7, 2, 1)
3
>>> flow_frame_count(7, 1, 2)
5
>>> flow_frame_count(7, 2, 2)
3
>>> flow_frame_count(7, 3, 1)
2
>>> flow_frame_count(7, 1, 3)
4
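One closed form consistent with all of the documented examples is \(\lceil (n - \text{dilation}) / \text{stride} \rceil\), where \(n\) is the RGB frame count. The sketch below is inferred from those examples rather than taken from the library source; flow_frame_count_sketch and the clamp to zero are assumptions.

```python
from typing import Dict, Tuple


def flow_frame_count_sketch(rgb_frame_count: int, stride: int, dilation: int) -> int:
    """Ceiling of (rgb_frame_count - dilation) / stride, clamped at zero."""
    usable = rgb_frame_count - dilation  # frames that still have a partner frame
    if usable <= 0:
        return 0
    return (usable + stride - 1) // stride  # integer ceiling division


# (rgb_frame_count, stride, dilation) -> expected count, copied from the examples.
documented: Dict[Tuple[int, int, int], int] = {
    (6, 1, 1): 5, (6, 2, 1): 3, (6, 1, 2): 4, (6, 2, 2): 2,
    (6, 3, 1): 2, (6, 1, 3): 3, (7, 1, 1): 6, (7, 2, 1): 3,
    (7, 1, 2): 5, (7, 2, 2): 3, (7, 3, 1): 2, (7, 1, 3): 4,
}
mismatches = [
    args for args, expected in documented.items()
    if flow_frame_count_sketch(*args) != expected
]
```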
-
epic_kitchens.time.seconds_to_timestamp(total_seconds)[source]¶ Convert seconds into a timestamp
- Parameters
total_seconds (float) – time in seconds
- Return type
- Returns
timestamp representing total_seconds
Examples
>>> seconds_to_timestamp(1)
'00:00:1.000'
>>> seconds_to_timestamp(1.1)
'00:00:1.100'
>>> seconds_to_timestamp(60)
'00:01:0.000'
>>> seconds_to_timestamp(61)
'00:01:1.000'
>>> seconds_to_timestamp(60 * 60 + 1)
'01:00:1.000'
>>> seconds_to_timestamp(60 * 60 + 60 + 1)
'01:01:1.000'
>>> seconds_to_timestamp(1225.78500002)
'00:20:25.785'
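The examples above can be reproduced with two divmod calls and a format string. This sketch mirrors the documented outputs, including the seconds field that is not zero-padded; it is not the library's source, and seconds_to_timestamp_sketch is a hypothetical name.

```python
def seconds_to_timestamp_sketch(total_seconds: float) -> str:
    """Format seconds as HH:MM:S.mmm, matching the documented example outputs."""
    whole_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(int(whole_minutes), 60)
    # The seconds field is deliberately left unpadded, as in the examples.
    return "{:02d}:{:02d}:{:.3f}".format(hours, minutes, seconds)


stamps = [
    seconds_to_timestamp_sketch(value)
    for value in (1, 1.1, 60, 60 * 60 + 60 + 1, 1225.78500002)
]
```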
-
epic_kitchens.time.timestamp_to_frame(timestamp, fps)[source]¶ Convert timestamp to frame number given the FPS of the extracted frames
- Parameters
- Return type
- Returns
frame corresponding to timestamp
Examples
>>> timestamp_to_frame("00:00:00", 29.97)
1
>>> timestamp_to_frame("00:00:01", 29.97)
29
>>> timestamp_to_frame("00:00:01", 59.94)
59
>>> timestamp_to_frame("00:01:00", 60)
3600
>>> timestamp_to_frame("01:00:00", 60)
216000
-
epic_kitchens.time.timestamp_to_seconds(timestamp)[source]¶ Convert a timestamp into total number of seconds
- Parameters
timestamp (str) – formatted as HH:MM:SS[.FractionalPart]
- Return type
- Returns
timestamp converted to seconds
Examples
>>> timestamp_to_seconds("00:00:00")
0.0
>>> timestamp_to_seconds("00:00:05")
5.0
>>> timestamp_to_seconds("00:00:05.5")
5.5
>>> timestamp_to_seconds("00:01:05.5")
65.5
>>> timestamp_to_seconds("01:01:05.5")
3665.5
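Both timestamp conversions can be sketched in a few lines. The frame rule used here (floor of seconds times fps, never below frame 1) is inferred from the timestamp_to_frame examples and is an assumption about the behaviour, not the library's source; both function names are hypothetical.

```python
import math


def timestamp_to_seconds_sketch(timestamp: str) -> float:
    """Parse HH:MM:SS[.FractionalPart] into a number of seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def timestamp_to_frame_sketch(timestamp: str, fps: float) -> int:
    """Floor of seconds * fps, with frame numbering assumed to start at 1."""
    total_seconds = timestamp_to_seconds_sketch(timestamp)
    return max(1, math.floor(total_seconds * fps))


seconds = timestamp_to_seconds_sketch("01:01:05.5")
frames = [
    timestamp_to_frame_sketch("00:00:00", 29.97),
    timestamp_to_frame_sketch("00:00:01", 29.97),
    timestamp_to_frame_sketch("00:00:01", 59.94),
    timestamp_to_frame_sketch("00:01:00", 60),
]
```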
epic_kitchens.video¶
-
class
epic_kitchens.video.FlowModalityIterator(dilation=1, stride=1, bound=20, rgb_fps=59.94)[source]¶ Bases:
epic_kitchens.video.ModalityIterator
Iterator for optical flow \((u, v)\) frames
-
class
epic_kitchens.video.ModalityIterator[source]¶ Bases:
abc.ABC
Interface that a modality extracted from video must implement
-
class
epic_kitchens.video.RGBModalityIterator(fps)[source]¶ Bases:
epic_kitchens.video.ModalityIterator
Iterator for RGB frames
-
epic_kitchens.video.get_narration(annotation)[source]¶ Get narration from annotation row, defaults to "unnarrated" if row has no narration column.
-
epic_kitchens.video.iterate_frame_dir(root)[source]¶ Iterate over a directory of video dirs with the hierarchy
root/P01/P01_01/
-
epic_kitchens.video.split_dataset_frames(modality_iterator, frames_dir, segment_root_dir, annotations, frame_format='frame%06d.jpg', pattern=re.compile('.*'))[source]¶ Split dumped video frames from frames_dir into directories within segment_root_dir for each video segment defined in annotations.
- Parameters
modality_iterator (ModalityIterator) – Modality iterator of frames
frames_dir (Path) – Directory containing dumped frames
segment_root_dir (Path) – Directory to write split segments to
annotations (DataFrame) – Dataframe containing segment information
frame_format (str, optional) – Old style string format that must contain a single %d formatter describing file name format of the dumped frames.
pattern (re.Pattern, optional) – Regexp to match video directories
- Return type
None
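An old style format string with a single %d formatter expands as follows. The first pattern is the default frame_format from the signature above; the second is an assumed pattern chosen to match the file names shown in the adapter's directory listing.

```python
# Default pattern from the split_dataset_frames signature.
frame_format = "frame%06d.jpg"
names = [frame_format % index for index in (1, 42, 123456)]

# Assumed pattern matching names such as frame_0000000008.jpg.
padded_format = "frame_%010d.jpg"
padded_name = padded_format % 8
```

The %06d conversion zero-pads the frame index to six digits, so file names sort lexicographically in frame order.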
-
epic_kitchens.video.split_video_frames(modality_iterator, frame_format, video_annotations, segment_root_dir, video_dir)[source]¶ Split frames from a single video file stored in video_dir into segment directories stored in segment_root_dir.
- Parameters
modality_iterator (ModalityIterator) – Modality iterator
frame_format (str) – Old style string format that must contain a single %d formatter describing file name format of the dumped frames.
video_annotations (DataFrame) – Dataframe containing rows only corresponding to video frames stored in video_dir
segment_root_dir (Path) – Directory to write split segments to
video_dir (Path) – Directory containing dumped frames for a single video
- Return type
None