Editing a GTF file

I need to extract only the ENSEMBL non-chromosomal pseudogenes from a given GTF file, add an additional attribute field "filtered" with the value "manually" to each of the annotated pseudogenes, and save the result as a new file. In other words, I have to keep the lines that contain "ENSEMBL" and "pseudogene" and do not contain "chr", append the extra property (filtered: manually) to the last column, and write the result to a new file. Could you tell me how I can do this, preferably using awk or sed?

    ##description: evidence-based annotation of the human genome (GRCh38), version 29 (Ensembl 94)
    ##provider: GENCODE
    ##contact: gencode-help@ebi.ac.uk
    ##format: gtf
    ##date: 2018-08-30
    chr1    HAVANA  gene    11869   14409   .       +       .       gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; level 2; havana_gene "OTTHUMG00000000961.2";
    chr1    HAVANA  transcript      11869   14409   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
    chr1    HAVANA  exon    11869   12227   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 1; exon_id "ENSE00002234944.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
    chr1    HAVANA  exon    12613   12721   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 2; exon_id "ENSE00003582793.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
    chr1    HAVANA  exon    13221   14409   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "processed_transcript"; transcript_name "DDX11L1-202"; exon_number 3; exon_id "ENSE00002312635.1"; level 2; transcript_support_level "1"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000362751.1";
    chr1    HAVANA  transcript      12010   13670   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
    chr1    HAVANA  exon    12010   12057   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; exon_number 1; exon_id "ENSE00001948541.1"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
    chr1    HAVANA  exon    12179   12227   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; exon_number 2; exon_id "ENSE00001671638.2"; level 2; transcript_support_level "NA"; ont "PGO:0000005"; ont "PGO:0000019"; tag "basic"; havana_gene "OTTHUMG00000000961.2"; havana_transcript "OTTHUMT00000002844.2";
    chr1    HAVANA  exon    12613   12697   .       +       .       gene_id "ENSG00000223972.5"; transcript_id "ENST00000450305.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; transcript_type "transcribed_unp

edited Nov 19 at 14:29 by zx8754

asked Nov 19 at 13:25 by Sergei

What have you already tried?
– Didier Trosset
Nov 19 at 13:46

What is the expected output?
– zx8754
Nov 19 at 14:30

Which lines in the example describe an ENSEMBL non-chromosomal pseudogene, and why (what are the relevant strings)?
– Jay jargot
Nov 19 at 14:35

These are the lines that match the patterns: ENSEMBL exon 169224 169502 . - . gene_id "ENSG00000284215.2"; transcript_id "ENST00000639764.2"; gene_type "pseudogene"; gene_name "AC245056.4"; transcript_type "pseudogene"; transcript_name "AC245056.4-201"; exon_number 2; exon_id "ENSE00003804365.1"; level 3; tag "basic"; Filtered: manually;
– Sergei
Nov 19 at 15:05

Actually, I have managed to do this, but maybe there is a better solution using only awk?
– Sergei
Nov 19 at 15:05

regex bash awk sed bioinformatics


1 Answer
accepted

If you are using Awk anyway, you don't need grep at all.

Also, less crucially, modifying $0 is mildly wasteful. print lets you specify precisely what you want to print.

awk '!/##/ && !/chr/ && /pseudogene/ && /ENSEMBL/ {
       print $0" Filtered: manually;"}' gencode.v29.chr_patch_hapl_scaff.basic.annotation.gtf > gencode.v29.filtered.gtf
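
A stricter variant, offered only as a sketch: the patterns above match anywhere on the line, so a `chr` or `ENSEMBL` inside an attribute value would also count. Assuming the file is tab-separated with the sequence name in column 1, the source in column 2 and the attributes in column 9 (the usual GTF layout), the tests can be tied to those fields. The sample records, the gene ids `G1` to `G4` and the `scaffold_1` name below are made up for illustration, and the appended attribute uses the GTF `key "value";` form instead of `Filtered: manually;`, which is an assumption about what downstream tools expect.

```shell
# Made-up sample: one header line and four records; %b expands \t to real tabs.
out=$(
  printf '%b\n' \
    '##format: gtf' \
    'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "G1"; gene_type "transcribed_unprocessed_pseudogene";' \
    'chr2\tENSEMBL\tgene\t100\t200\t.\t+\t.\tgene_id "G2"; gene_type "pseudogene";' \
    'scaffold_1\tENSEMBL\texon\t300\t400\t.\t-\t.\tgene_id "G3"; gene_type "pseudogene";' \
    'scaffold_1\tENSEMBL\texon\t500\t600\t.\t-\t.\tgene_id "G4"; gene_type "protein_coding";' |
  awk -F'\t' '
    # Skip header and comment lines.
    /^#/ { next }
    # Keep records whose sequence name does not start with chr, whose source
    # is exactly ENSEMBL, and whose gene_type ends in pseudogene.
    $1 !~ /^chr/ && $2 == "ENSEMBL" && $9 ~ /gene_type "[^"]*pseudogene"/ {
      print $0 " filtered \"manually\";"
    }'
)
printf '%s\n' "$out"
```

With this sample only the `G3` record should be printed, with ` filtered "manually";` appended. To run it on the real file, replace the `printf ... |` part with the input file name after the awk program and redirect the output to the new file.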

answered Nov 19 at 15:36 by tripleee

Thanks, yes this is much better)
– Sergei
Nov 19 at 16:25
