Searching for TFBS with PFM

Task Name: pfm-search

This task searches for transcription factor binding sites (TFBS) using position weight matrices (PWMs) converted from the input position frequency matrices (PFMs), and saves the identified regions as annotations.

Parameters:

  • seq — A semicolon-separated list of input sequence files to search for TFBS. [String, Required]

  • matrix — A semicolon-separated list of the input PFM files. [String, Required]

  • out — The output GenBank file.

  • name — The name of the annotated regions. [String, Optional, Default: “misc_feature”]

  • type — The type of the matrix. [Boolean, Optional, Default: false]

    The following values are available:

    • true (dinucleic type)
    • false (mononucleic type)

    Dinucleic matrices provide more detail, while mononucleic matrices are more useful for smaller input datasets.

  • algo — The algorithm used to convert a PFM to a PWM. [String, Optional, Default: “Berg and von Hippel”]

    The following values are available:

    • Berg and von Hippel
    • Log-odds
    • Match
    • NLG

  • score — The minimum score, in percent, that a match must reach to be reported as a TFBS. [Number, Optional, Default: 85]

  • strand — The strands to search in. [Number, Optional, Default: 0]

    The following values are available:

    • 0 (both strands)
    • 1 (direct strand)
    • -1 (complement strand)
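
The way the matrix, score, and strand parameters interact can be illustrated with a minimal Python sketch. This is not UGENE's implementation: it assumes a plain log-odds conversion with a pseudocount of 1 and a uniform background of 0.25, and it assumes the percentage score is the raw score rescaled between the lowest and highest scores the matrix can produce. The function names and the small GATA matrix are hypothetical, and only the mononucleic case is shown.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert a mononucleotide PFM (base -> counts per position) to a log-odds PWM."""
    length = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for i in range(length):
        total = sum(pfm[base][i] for base in BASES) + 4 * pseudocount
        for base in BASES:
            freq = (pfm[base][i] + pseudocount) / total
            pwm[base].append(math.log2(freq / background))
    return pwm


def percent_score(pwm, window):
    """Rescale the raw PWM score of a window to 0..100 between the matrix minimum and maximum."""
    positions = range(len(window))
    raw = sum(pwm[base][i] for i, base in enumerate(window))
    lowest = sum(min(pwm[b][i] for b in BASES) for i in positions)
    highest = sum(max(pwm[b][i] for b in BASES) for i in positions)
    if highest == lowest:  # uninformative matrix: every window scores the same
        return 100.0
    return 100.0 * (raw - lowest) / (highest - lowest)


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def search(seq, pwm, min_score=85.0, strand=0):
    """Return (start, strand, score) for every window reaching min_score.

    strand follows the convention above: 0 both, 1 direct, -1 complement.
    """
    length = len(pwm["A"])
    hits = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous symbols
        if strand in (0, 1):
            score = percent_score(pwm, window)
            if score >= min_score:
                hits.append((start, 1, score))
        if strand in (0, -1):
            score = percent_score(pwm, reverse_complement(window))
            if score >= min_score:
                hits.append((start, -1, score))
    return hits


# Hypothetical PFM built from ten aligned copies of the motif GATA.
gata_pfm = {
    "A": [0, 10, 0, 10],
    "C": [0, 0, 0, 0],
    "G": [10, 0, 0, 0],
    "T": [0, 0, 10, 0],
}
gata_pwm = pfm_to_pwm(gata_pfm)
hits = search("CCGATACCTATCCC", gata_pwm, min_score=85.0, strand=0)
```

In this sketch, a hit on the complement strand is reported at the start position of the window on the direct strand. Raising min_score keeps only windows closer to the best possible match, and setting strand to 1 or -1 restricts the scan to one orientation.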

Example:

ugene pfm-search --seq=in.fa --matrix="MA0265.1.pfm;MA0266.1.pfm" --out=res.gb
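
When the task is launched from a script, one possible approach is to assemble the arguments as a list so that the semicolons in the file lists are never interpreted by a shell. The helper below is a hypothetical sketch: it uses only the parameters documented above and assumes the ugene executable is available on the PATH.

```python
import shlex


def build_pfm_search_command(seqs, matrices, out, score=85, strand=0, executable="ugene"):
    """Build the argument list for the pfm-search task (hypothetical helper)."""
    return [
        executable,
        "pfm-search",
        "--seq=" + ";".join(seqs),        # semicolon-separated sequence files
        "--matrix=" + ";".join(matrices),  # semicolon-separated PFM files
        "--out=" + out,
        "--score=" + str(score),
        "--strand=" + str(strand),
    ]


cmd = build_pfm_search_command(["in.fa"], ["MA0265.1.pfm", "MA0266.1.pfm"], "res.gb")
# shlex.join shows a shell-safe form of the command for logging.
print(shlex.join(cmd))
```

The list can be passed directly to subprocess.run(cmd, check=True), which starts the program without going through a shell.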