I know you're trying to make things functional for both Linux & MacOS, so some care would be required there; but I'd recommend taking a look at GNU getopt or getopts for argument processing & parsing.
Additionally, I'd personally set default values to variables set in the argument parse loop before the loop occurrs -- that way you don't need to check if a variable is undefined before setting a value.
Here's how I'd rewrite the argparse loop & variable definitions. This would have the added benefit of allowing the user to condense multiple short options into on string (eg -ku)...though, YMMV with MacOS support, as I'm wholly unfamiliar with that platform):
mode="fzf"
mirrornum="3"
view="1"
tokindle="0"
tousb="0"
opts="$(getopts --options hrm:Vku --longoptions help,rofi,mirror:,no-view,to-kindle,to-usb --name "${0##*/}" -- "${@}")"
eval set -- "${opts}"
while true; do
case "${1}" in
-h | --help ) show_help; exit 0;;
-r | --rofi ) mode="rofi";;
-m | --mirror ) mirrornum="${2}"; shift;;
-V | --no-view ) view="0";;
-k | --to-kindle ) tokindle="1"; view="0";;
-u | --to-usb ) tousb="1"; view="0";;
-- ) shift; break;;
* ) query="${*}"; break;;
esac
shift
doneIn general, I'd recommend using printf over echo most of the time -- including for returning data from a function (source). Better control and more reliable, imo.
One exception would be for when you need to output a big block of text, as its usually easier to use a heredoc -- like for help text:
cat << EOF
Usage: libby [--rofi] [--mirror <num>] [--no-view] [--to-kindle] <query>
--rofi: use rofi to select a book (default is fzf)"
--mirror <num>: use an alternative libgen mirror (default: 3)"
--no-view: don't view the result"
--to-kindle: convert file to .mobi and send to \$KINDLE_EMAIL via mutt/neomutt (implies --no-view)"
--to-usb: copy file to root of \$USB via rsync (implies --no-view)"
EOFAlso semi related, you can use parameter expansion to reference the script's filename directly, without needing to hardcode it:
Usage: ${0##*/} [--rofi] [--mirror <num>] [--no-view] [--to-kindle] <query>Useful if you want to make a generic template.
I'm glad to see you're using a function to group the download operation(s), though I wanted to point out that all of the variables defined in this function are global to the script -- meaning they will still be set after the function returns. That may not be a problem, but its worth noting going forward.
Personally, I try to always declare my function variables with local to ensure proper scoping.
You don't need to cat a file into sed or similar tools; as they take a file as one of their arguments:
sed 's/;0;//g' /tmp/libgen.txtIf you'd rather send the contents of the file directly to stdin, you can also use file redirection:
sed 's/;0;//g' < /tmp/libgen.txtAlso, just wanted to biefly mention that pup also supports reading from a file directly, though with the -f|--file flag:
pup --file "/path/to/html" "tr:nth-of-type($trr) td:nth-of-type(3) text{}"Similar to the cat example, also do not need to echo a variable into sed or similar tools; instead, you can use herestrings to send the contents directly to stdin:
pup "tr:nth-of-type($trr) td:nth-of-type(3) text{}" <<< "$bookdata"
jq -sRr '@uri' <<< "${query}"tr vs Parameter Expansion
Kinda similar to the notes on cat & echo, you also do not need tr to convert the contents of a variable into upper or lower case, instead you can use parameter expansion:
# To Lower
${title,,}
# To Upper
${title^^}Likewise, you can also do search-replace operations directly with parameter expansion & pattern matching:
# Replace First
${HOME/[:alnum:._-]*/}
# Replace All
${HOME//[:alnum:._-]*/}You can combine sed statements/operations into a single call with ; as a delimiter between them:
sed -n "${line}p" "/tmp/libgen.txt" | \
sed 's/.*;2;//;s/;3;.*//g'Though, its worth noting that you still need to split things up if you explicitly need to operate on the output of another statement
I'd highly recommend taking a look at awk to replace some of these sed calls -- I think it could really help clean things up and make it more readable. Though, I see you've replaced trim with awk '{$1=$1};1', which is certainly interesting. As an example, all the sed calls when prompting the user for a selection could be condensed into a handful of sed calls, or into one awk call:
awk --field-separator ";" '
{
line = $3 ": " $5 " - " $7 " (" $9 ", " $11 ") [" $13 "]"
print gensub(/^\s*|\s*$/, "", "G", line)
}
' "/tmp/libgen.txt" | \
recode -q html..ascii | \
fzf | \
cut -d':' -f1You can do something similar with sed, to be fair; I just find awk a bit easier to read through.
I know you've done so for LIBBY_OUTPUT_DIRECTORY, but it may be worth prepending LIBBY_ for both $USB & $KINDLE_EMAIL to prevent potential overlap with other utilities that rely on the same naming.
Likewise, I'd make sure those two & LIBBY_VIEWER are documented in your README, so the user knows they can be set.
Unless you explicitly need the --continue flag from get and you will not know an offset for the curl equivalent (--continue-at <offset>), I wouldn't use wget in this script at all. It seems to me you could just use curl and get the same effect as wget -cO - "$mirror" > "/path/to/out".
While it is a preference thing, its very difficult to keep track of what your script is doing at any given point, especially because almost all of the code is one, giant function. I'd recommend seeing what parts of the download() function can be abstracted out to their own function. For example:
# $1: trr, $2: td type; $3 bookdata
getBookField() {
local ln pupFilter type
pupFilter="tr:nth-of-type(${1}) td:nth-of-type(N)"
case "${2,,}" in
title ) type="1";;
author ) type="2";;
publisher ) type="3";;
year ) type="4";;
language ) type="5";;
size ) type="7";;
filetype ) type="8";;
mirror ) type="9";;
* )
printf '%s\n' "Ignoring unknown filter type: '${2,,}'" >&2
return 1
;;
esac
pupFilter="${pupFilter/N/${type}}"
case "${type}" in
1 | 2 )
(($1 == 1)) && ln="8"
pup "${pupFilter} text{}" <<< "${3}" | \
sed '/^\s*$/d' | \
awk --assign "num=${ln:-1}" '
NR == num {
print gensub(/^\s*|\s*$/, "", "G", $0)
}
'
;;
9 )
pup "${pupFilter} a:nth-of-type(${mirrornum}) attr{href}" <<< "${3}" | \
sed 's/^\s*//;s/\s*$//'
;;
* )
pup "${pupFilter} text{} | \
sed 's/^\s*//;s/\s*$//'
;;
esac
}Which would make the book<Field> definitions a little easier to read:
bookTitle="$(getBookField "${trr}" "title" "${bookdata}")"
bookAuthor="$(getBookField "${trr}" "author" "${bookdata}")"
[...]May also be worth looking in an array to store the book data.
I'd greatly recommend using [[...]] for simple comparisons in bash over [...]. See here for a better explanation than I can give in a reddit comment.
Secondly, while I'm not immediately pulling documentation for you, you can also perform arithmetic operations & comparisons using ((...)) syntax in bash. This syntax can also be used for C-Style for loops, which is quite handy.
((tousb == 1)) # True if and only if 'tousb' is set to '1'
((tousb != 1)) # True if 'tousb' is set to anything other than '1', *including* nothing
((tousb)) # True if 'tousb' is set to any non-zero valueFor a quick for-loop example, assuming the getBookField() function above is defined:
# $* bookdata
parseQueryResult() {
local i j idx
local -a line
idx="1"
for ((i = 1; i > 0; i++)); do
line[0]="$(getBookField "${i}" "1" "${*}")"
[[ -n "${line[0]}" ]] || return
for j in {2..5} {7..9}; do
line+=("$(getBookField "${i}" "${j}" "${*}")"
done
if ! [[ "${line[6],,}" =~ (epub|pdf) ]]; then
((idx++))
unset line
continue
fi
printf '%s;' ";0" "${idx}"
for ((j = 0; j < ${#line[@]}; j++)); do
printf '%s;' "$((j + 1))" "${line[${j}]}"
done
printf '%s\n' ""
unset line
((idx++))
done
}