The alpha channel "disappears" be... python \n",
+ "4 <p>You need to specify the <code>index</code> ... python "
+ ]
+ },
+ "execution_count": 19,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "so_df = pd.read_csv('so_database_app.csv')\n",
+ "so_df.head()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "id": "7a413056-0ec7-4137-a191-814e4797e0a2",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "import pickle"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "id": "c7836318-0148-456d-b29e-4a26aa4f8221",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [],
+ "source": [
+ "with open('question_embeddings_app.pkl', 'rb') as file:\n",
+ " question_embeddings = pickle.load(file)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "id": "7a4134a8-76d3-4677-b180-f8477b1a2c15",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Shape: (2000, 768)\n",
+ "[[-0.03571156 -0.00240684 0.05860338 ... -0.03100227 -0.00855574\n",
+ " -0.01997405]\n",
+ " [-0.02024316 -0.0026255 0.01940405 ... -0.02158143 -0.05655403\n",
+ " -0.01040497]\n",
+ " [-0.05175979 -0.03712264 0.02699278 ... -0.07055898 -0.0402537\n",
+ " 0.00092099]\n",
+ " ...\n",
+ " [-0.00580394 -0.01621097 0.05829635 ... -0.03350992 -0.05343556\n",
+ " -0.06016821]\n",
+ " [-0.00436622 -0.02692963 0.03363771 ... -0.01686567 -0.03812337\n",
+ " -0.02329491]\n",
+ " [-0.04240424 -0.01633749 0.05516777 ... -0.02697376 -0.01751165\n",
+ " -0.04558187]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Shape: \" + str(question_embeddings.shape))\n",
+ "print(question_embeddings)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3061eb7d-b69c-4eae-b583-67e9d7ca1f47",
+ "metadata": {},
+ "source": [
+ "#### Cluster the embeddings of the Stack Overflow questions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "id": "a58b189d-61f7-4c68-8c73-769c59a23987",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.cluster import KMeans\n",
+ "from sklearn.decomposition import PCA"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "id": "dd2cd054-426e-4d92-9561-285a0b55792e",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "clustering_dataset = question_embeddings[:1000]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "id": "05421596-894c-4d53-9551-61839e72c0eb",
+ "metadata": {
+ "height": 83
+ },
+ "outputs": [],
+ "source": [
+ "n_clusters = 2\n",
+ "kmeans = KMeans(n_clusters=n_clusters, \n",
+ " random_state=0, \n",
+ " n_init = 'auto').fit(clustering_dataset)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "id": "7fac5a79-7639-4072-8c9d-86dc2e2c6658",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "kmeans_labels = kmeans.labels_"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "id": "210b0fbe-a5df-488f-9814-60f4304804d0",
+ "metadata": {
+ "height": 66
+ },
+ "outputs": [],
+ "source": [
+ "PCA_model = PCA(n_components=2)\n",
+ "PCA_model.fit(clustering_dataset)\n",
+ "new_values = PCA_model.transform(clustering_dataset)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "id": "6d2095e0-605f-4dd3-9786-2f313ff5c60a",
+ "metadata": {
+ "height": 66
+ },
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import mplcursors\n",
+ "%matplotlib ipympl"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "961b770b-53ea-4b49-bfc8-507cd711fc99",
+ "metadata": {
+ "height": 66
+ },
+ "outputs": [],
+ "source": [
+ "from utils import clusters_2D\n",
+ "clusters_2D(x_values = new_values[:,0], y_values = new_values[:,1], \n",
+ " labels = so_df[:1000], kmeans_labels = kmeans_labels)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b499cf9d-c739-4a34-b38d-e270fffe14f7",
+ "metadata": {},
+ "source": [
+ "- K-means separates the questions into two clusters that correspond to the HTML-related and Python-related questions, even though it is never given the category labels."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "c46c217f-d121-4a14-a351-340a134db73f",
+ "metadata": {},
+ "source": [
+ "## Anomaly / Outlier detection\n",
+ "\n",
+ "- Add an off-topic piece of text and check whether an outlier (anomaly) detection algorithm, Isolation Forest, flags it as an outlier based on its embedding."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "id": "8f7e3a9e-ec14-4f85-9eef-5766f85e0d0c",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.ensemble import IsolationForest"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "id": "96d872a2-5fbb-466b-ac80-84a1c6458006",
+ "metadata": {
+ "height": 83
+ },
+ "outputs": [],
+ "source": [
+ "input_text = \"\"\"I am making cookies but don't \n",
+ " remember the correct ingredient proportions. \n",
+ " I have been unable to find \n",
+ " anything on the web.\"\"\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "id": "ffb3b6f7-8cc8-4f3b-a19e-541956903e9e",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "emb = model.get_embeddings([input_text])[0].values"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 33,
+ "id": "b89da1d8-d17f-4ac7-a774-7fc0f67fba00",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [],
+ "source": [
+ "embeddings_l = question_embeddings.tolist()\n",
+ "embeddings_l.append(emb)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 34,
+ "id": "1f98eb7a-9a62-4d08-8744-5618cdf6d241",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "embeddings_array = np.array(embeddings_l)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "id": "b8a6eaa7-1067-4414-9e40-05352eb5ec03",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Shape: (2001, 768)\n",
+ "[[-0.03571156 -0.00240684 0.05860338 ... -0.03100227 -0.00855574\n",
+ " -0.01997405]\n",
+ " [-0.02024316 -0.0026255 0.01940405 ... -0.02158143 -0.05655403\n",
+ " -0.01040497]\n",
+ " [-0.05175979 -0.03712264 0.02699278 ... -0.07055898 -0.0402537\n",
+ " 0.00092099]\n",
+ " ...\n",
+ " [-0.00436622 -0.02692963 0.03363771 ... -0.01686567 -0.03812337\n",
+ " -0.02329491]\n",
+ " [-0.04240424 -0.01633749 0.05516777 ... -0.02697376 -0.01751165\n",
+ " -0.04558187]\n",
+ " [-0.00302366 -0.02049104 0.02172194 ... -0.04479321 -0.05254056\n",
+ " -0.00319716]]\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Shape: \" + str(embeddings_array.shape))\n",
+ "print(embeddings_array)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 36,
+ "id": "11467308-b48a-4341-a918-83940239762f",
+ "metadata": {
+ "height": 117
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
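+ "<div>\n",
+ "<style scoped>\n",
+ "    .dataframe tbody tr th:only-of-type {\n",
+ "        vertical-align: middle;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe tbody tr th {\n",
+ "        vertical-align: top;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe thead th {\n",
+ "        text-align: right;\n",
+ "    }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>input_text</th>\n",
+ "      <th>output_text</th>\n",
+ "      <th>category</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>1996</th>\n",
+ "      <td>Flip Clock code works on Codepen and doesn't w...</td>\n",
+ "      <td>&lt;p&gt;You forgot to attach the CSS file for the f...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1997</th>\n",
+ "      <td>React Native How can I put one view in front o...</td>\n",
+ "      <td>&lt;p&gt;You can do it using zIndex for example:&lt;/p&gt;...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1998</th>\n",
+ "      <td>setting fixed width with 100% height of the pa...</td>\n",
+ "      <td>&lt;p&gt;You can use &lt;code&gt;width: calc(100% - 100px)...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1999</th>\n",
+ "      <td>How to make sidebar button not bring viewpoint...</td>\n",
+ "      <td>&lt;p&gt;It is quite simple, just remove that href=\"...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2001</th>\n",
+ "      <td>I am making cookies but don't \\n ...</td>\n",
+ "      <td>None</td>\n",
+ "      <td>baking</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"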
+ ],
+ "text/plain": [
+ " input_text \\\n",
+ "1996 Flip Clock code works on Codepen and doesn't w... \n",
+ "1997 React Native How can I put one view in front o... \n",
+ "1998 setting fixed width with 100% height of the pa... \n",
+ "1999 How to make sidebar button not bring viewpoint... \n",
+ "2001 I am making cookies but don't \\n ... \n",
+ "\n",
+ " output_text category \n",
+ "1996 <p>You forgot to attach the CSS file for the f... css \n",
+ "1997 <p>You can do it using zIndex for example:</p>... css \n",
+ "1998 <p>You can use <code>width: calc(100% - 100px)... css \n",
+ "1999 <p>It is quite simple, just remove that href=\"... css \n",
+ "2001 None baking "
+ ]
+ },
+ "execution_count": 36,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Append the outlier text to the end of the Stack Overflow DataFrame\n",
+ "so_df = pd.read_csv('so_database_app.csv')\n",
+ "new_row = pd.Series([input_text, None, \"baking\"], \n",
+ " index=so_df.columns)\n",
+ "so_df.loc[len(so_df)+1] = new_row\n",
+ "so_df.tail()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "da8b00f6-bc36-4e07-87af-043609e273c4",
+ "metadata": {},
+ "source": [
+ "#### Use Isolation Forest to identify potential outliers\n",
+ "\n",
+ "- `IsolationForest` is an unsupervised outlier detector: `fit_predict` returns `-1` for potential outliers and `1` for inliers.\n",
+ "- With `contamination=0.005`, roughly 0.5% of the 2001 embeddings (about 10 rows) are flagged.\n",
+ "- Inspect the rows flagged as potential outliers and verify that the baking question is among them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 37,
+ "id": "36815aaf-fa2a-4e0f-98d0-07c334a79887",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [],
+ "source": [
+ "clf = IsolationForest(contamination=0.005, \n",
+ " random_state = 2) "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 38,
+ "id": "5781dcd9-cdce-4307-ba49-dccf9cd7646e",
+ "metadata": {
+ "height": 66
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2001 predictions. Set of possible values: {1, -1}\n"
+ ]
+ }
+ ],
+ "source": [
+ "preds = clf.fit_predict(embeddings_array)\n",
+ "\n",
+ "print(f\"{len(preds)} predictions. Set of possible values: {set(preds)}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "id": "483e753a-5291-423b-a7a3-c36af012d3f4",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ "    .dataframe tbody tr th:only-of-type {\n",
+ "        vertical-align: middle;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe tbody tr th {\n",
+ "        vertical-align: top;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe thead th {\n",
+ "        text-align: right;\n",
+ "    }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>input_text</th>\n",
+ "      <th>output_text</th>\n",
+ "      <th>category</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>203</th>\n",
+ "      <td>extract channel names from a multi-channel ima...</td>\n",
+ "      <td>&lt;p&gt;PerkinElmer QPI metadata are stored as XML ...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1018</th>\n",
+ "      <td>ASP .NET - JSON Serializer not working on clas...</td>\n",
+ "      <td>&lt;p&gt;Ok, I forgot to add default &lt;code&gt;{ get; se...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1138</th>\n",
+ "      <td>parse year and month from a string SQL BigQuer...</td>\n",
+ "      <td>&lt;p&gt;How about using string operations?&lt;/p&gt;\\n&lt;pr...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1313</th>\n",
+ "      <td>Array initialization with ternary operator in ...</td>\n",
+ "      <td>&lt;p&gt;To make your code work, do the following in...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1358</th>\n",
+ "      <td>How to represent 2 Entity with 2 Relation in E...</td>\n",
+ "      <td>&lt;p&gt;&lt;a href=\"https://i.stack.imgur.com/BJxBP.pn...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1403</th>\n",
+ "      <td>Apache ignite Partition Map Exchange , Baselin...</td>\n",
+ "      <td>&lt;p&gt;Long story short, these topics are about da...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1427</th>\n",
+ "      <td>Shortcut to reveal in Finder for currently ope...</td>\n",
+ "      <td>&lt;p&gt;No. It is not present but we can add it. Go...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1493</th>\n",
+ "      <td>How to change id of datatable?&lt;p&gt;I have some w...</td>\n",
+ "      <td>&lt;p&gt;In short - you can't. But maybe you can:&lt;/p...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1498</th>\n",
+ "      <td>What’s the difference between Next.js rewrites...</td>\n",
+ "      <td>&lt;p&gt;&lt;code&gt;rewrites&lt;/code&gt; are a convenient way ...</td>\n",
+ "      <td>r</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2001</th>\n",
+ "      <td>I am making cookies but don't \\n ...</td>\n",
+ "      <td>None</td>\n",
+ "      <td>baking</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " input_text \\\n",
+ "203 extract channel names from a multi-channel ima... \n",
+ "1018 ASP .NET - JSON Serializer not working on clas... \n",
+ "1138 parse year and month from a string SQL BigQuer... \n",
+ "1313 Array initialization with ternary operator in ... \n",
+ "1358 How to represent 2 Entity with 2 Relation in E... \n",
+ "1403 Apache ignite Partition Map Exchange , Baselin... \n",
+ "1427 Shortcut to reveal in Finder for currently ope... \n",
+ "1493 How to change id of datatable?<p>I have some w... \n",
+ "1498 What’s the difference between Next.js rewrites... \n",
+ "2001 I am making cookies but don't \\n ... \n",
+ "\n",
+ " output_text category \n",
+ "203 <p>PerkinElmer QPI metadata are stored as XML ... python \n",
+ "1018 <p>Ok, I forgot to add default <code>{ get; se... r \n",
+ "1138 <p>How about using string operations?</p>\\n<pr... r \n",
+ "1313 <p>To make your code work, do the following in... r \n",
+ "1358 <p><a href=\"https://i.stack.imgur.com/BJxBP.pn... r \n",
+ "1403 <p>Long story short, these topics are about da... r \n",
+ "1427 <p>No. It is not present but we can add it. Go... r \n",
+ "1493 <p>In short - you can't. But maybe you can:</p... r \n",
+ "1498 <p><code>rewrites</code> are a convenient way ... r \n",
+ "2001 None baking "
+ ]
+ },
+ "execution_count": 39,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "so_df.loc[preds == -1]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4237f497-489e-448c-bf52-781a31d28558",
+ "metadata": {},
+ "source": [
+ "#### Remove the outlier about baking"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "id": "e8da293f-7945-4686-93fb-0a62269fed07",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "so_df = so_df.drop(so_df.index[-1])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 41,
+ "id": "869a7b82-a813-474d-98d3-d740a2e8e6c5",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<style scoped>\n",
+ "    .dataframe tbody tr th:only-of-type {\n",
+ "        vertical-align: middle;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe tbody tr th {\n",
+ "        vertical-align: top;\n",
+ "    }\n",
+ "\n",
+ "    .dataframe thead th {\n",
+ "        text-align: right;\n",
+ "    }\n",
+ "</style>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>input_text</th>\n",
+ "      <th>output_text</th>\n",
+ "      <th>category</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>python's inspect.getfile returns \"&lt;string&gt;\"&lt;p&gt;...</td>\n",
+ "      <td>&lt;p&gt;&lt;code&gt;&amp;lt;string&amp;gt;&lt;/code&gt; means that the ...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>Passing parameter to function while multithrea...</td>\n",
+ "      <td>&lt;p&gt;Try this and note the difference:&lt;/p&gt;\\n&lt;pre...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>How do we test a specific method written in a ...</td>\n",
+ "      <td>&lt;p&gt;Duplicate of &lt;a href=\"https://stackoverflow...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>3</th>\n",
+ "      <td>how can i remove the black bg color of an imag...</td>\n",
+ "      <td>&lt;p&gt;The alpha channel &amp;quot;disappears&amp;quot; be...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>4</th>\n",
+ "      <td>How to extract each sheet within an Excel file...</td>\n",
+ "      <td>&lt;p&gt;You need to specify the &lt;code&gt;index&lt;/code&gt; ...</td>\n",
+ "      <td>python</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>...</th>\n",
+ "      <td>...</td>\n",
+ "      <td>...</td>\n",
+ "      <td>...</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1995</th>\n",
+ "      <td>Is it possible to made inline-block elements l...</td>\n",
+ "      <td>&lt;p&gt;If this is only for the visual purpose then...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1996</th>\n",
+ "      <td>Flip Clock code works on Codepen and doesn't w...</td>\n",
+ "      <td>&lt;p&gt;You forgot to attach the CSS file for the f...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1997</th>\n",
+ "      <td>React Native How can I put one view in front o...</td>\n",
+ "      <td>&lt;p&gt;You can do it using zIndex for example:&lt;/p&gt;...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1998</th>\n",
+ "      <td>setting fixed width with 100% height of the pa...</td>\n",
+ "      <td>&lt;p&gt;You can use &lt;code&gt;width: calc(100% - 100px)...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1999</th>\n",
+ "      <td>How to make sidebar button not bring viewpoint...</td>\n",
+ "      <td>&lt;p&gt;It is quite simple, just remove that href=\"...</td>\n",
+ "      <td>css</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "<p>2000 rows × 3 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " input_text \\\n",
+ "0 python's inspect.getfile returns \"<string>\"<p>... \n",
+ "1 Passing parameter to function while multithrea... \n",
+ "2 How do we test a specific method written in a ... \n",
+ "3 how can i remove the black bg color of an imag... \n",
+ "4 How to extract each sheet within an Excel file... \n",
+ "... ... \n",
+ "1995 Is it possible to made inline-block elements l... \n",
+ "1996 Flip Clock code works on Codepen and doesn't w... \n",
+ "1997 React Native How can I put one view in front o... \n",
+ "1998 setting fixed width with 100% height of the pa... \n",
+ "1999 How to make sidebar button not bring viewpoint... \n",
+ "\n",
+ " output_text category \n",
+ "0 <p><code>&lt;string&gt;</code> means that the ... python \n",
+ "1 <p>Try this and note the difference:</p>\\n<pre... python \n",
+ "2 <p>Duplicate of <a href=\"https://stackoverflow... python \n",
+ "3 <p>The alpha channel &quot;disappears&quot; be... python \n",
+ "4 <p>You need to specify the <code>index</code> ... python \n",
+ "... ... ... \n",
+ "1995 <p>If this is only for the visual purpose then... css \n",
+ "1996 <p>You forgot to attach the CSS file for the f... css \n",
+ "1997 <p>You can do it using zIndex for example:</p>... css \n",
+ "1998 <p>You can use <code>width: calc(100% - 100px)... css \n",
+ "1999 <p>It is quite simple, just remove that href=\"... css \n",
+ "\n",
+ "[2000 rows x 3 columns]"
+ ]
+ },
+ "execution_count": 41,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "so_df"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a14167fd-21f5-4929-9ec3-ff85a9a4de9e",
+ "metadata": {},
+ "source": [
+ "## Classification\n",
+ "- Train a random forest classifier on the question embeddings to predict the category of a Stack Overflow question (Python, R, HTML, or CSS)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "id": "319bf0c2-92d1-403b-acd9-919a3e21033f",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.ensemble import RandomForestClassifier"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 43,
+ "id": "6dd46eb2-445a-4e1f-97a0-8c995bfaf892",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [],
+ "source": [
+ "from sklearn.metrics import accuracy_score\n",
+ "from sklearn.model_selection import train_test_split"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 44,
+ "id": "c6a79406-57bb-4480-9184-1ba9bebd5c36",
+ "metadata": {
+ "height": 83
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(2000, 768)"
+ ]
+ },
+ "execution_count": 44,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# re-load the dataset from file\n",
+ "so_df = pd.read_csv('so_database_app.csv')\n",
+ "X = question_embeddings\n",
+ "X.shape"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 45,
+ "id": "10826eca-8fa0-4ce7-9b93-aa3e8fb19e2e",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(2000,)"
+ ]
+ },
+ "execution_count": 45,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "y = so_df['category'].values\n",
+ "y.shape"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 46,
+ "id": "a6a3ac12-b97a-4345-aea4-498de0babb38",
+ "metadata": {
+ "height": 83
+ },
+ "outputs": [],
+ "source": [
+ "X_train, X_test, y_train, y_test = train_test_split(X, \n",
+ " y, \n",
+ " test_size = 0.2, \n",
+ " random_state = 2)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 47,
+ "id": "b65f8db9-7c39-42bd-ac01-8217688454ec",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "clf = RandomForestClassifier(n_estimators=200)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 48,
+ "id": "1e6a0d25-9708-41b8-a754-bdccf5b94b66",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "RandomForestClassifier(n_estimators=200)"
+ ]
+ },
+ "execution_count": 48,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "clf.fit(X_train, y_train)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7d6eda9b-25f5-4cbd-8ab1-462361f48a21",
+ "metadata": {},
+ "source": [
+ "#### Check the predictions and accuracy on the test set"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 49,
+ "id": "9e134eb4-b3cd-425a-81dd-e2a96362c782",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": [
+ "y_pred = clf.predict(X_test)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 50,
+ "id": "0805c099-1349-45a0-b1bc-47b64b15495a",
+ "metadata": {
+ "height": 49
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Accuracy: 0.7025\n"
+ ]
+ }
+ ],
+ "source": [
+ "accuracy = accuracy_score(y_test, y_pred) # compute accuracy\n",
+ "print(\"Accuracy:\", accuracy)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ae498a1-26ee-4111-90ba-b625804353de",
+ "metadata": {},
+ "source": [
+ "#### Try out the classifier on some questions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "id": "1641137e-f556-443b-b67a-7468229ce7ee",
+ "metadata": {
+ "height": 253
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "For question 2, the prediction is `python`\n",
+ "The actual label is `python`\n",
+ "The question text is:\n",
+ "--------------------------------------------------\n",
+ "How do we test a specific method written in a list of files for functional testing in pythonThe project has so many modules. There are functional test cases being written for almost every api written like for GET requests, POST requests and PUT requests. To test an individual file we use the syntact pytest tests/file_name.py\n",
+ "but I want to test a specific method in that file. Is there any way to test it like that??
\n"
+ ]
+ }
+ ],
+ "source": [
+ "# choose a number between 0 and 1999\n",
+ "i = 2\n",
+ "label = so_df.loc[i,'category']\n",
+ "question = so_df.loc[i,'input_text']\n",
+ "\n",
+ "# get the embedding of this question and predict its category\n",
+ "question_embedding = model.get_embeddings([question])[0].values\n",
+ "pred = clf.predict([question_embedding])\n",
+ "\n",
+ "print(f\"For question {i}, the prediction is `{pred[0]}`\")\n",
+ "print(f\"The actual label is `{label}`\")\n",
+ "print(\"The question text is:\")\n",
+ "print(\"-\"*50)\n",
+ "print(question)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "932aa937-9b26-4e5e-8242-001cc2794a75",
+ "metadata": {
+ "height": 32
+ },
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/question_embeddings_app.pkl b/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/question_embeddings_app.pkl
new file mode 100644
index 0000000..89a3ae0
Binary files /dev/null and b/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/question_embeddings_app.pkl differ
diff --git a/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/so_database_app.csv b/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/so_database_app.csv
new file mode 100644
index 0000000..494a63e
--- /dev/null
+++ b/G-DeepLearning.AI/Q-Text-Embedding-GoogleCloud/L5-Applications/so_database_app.csv
@@ -0,0 +1,156031 @@
+input_text,output_text,category
+"python's inspect.getfile returns """"Consider this code:
+from sqlalchemy import exists
+import inspect
+
+print(inspect.getfile(exists))
+# Effectively calls:
+print(exists.__code__.co_filename)
+
+On 2 systems I've tested it on it prints:
+<string>
+<string>
+
+What does it mean? Could anything be done to get a proper filepath?
","<string> means that the function was defined dynamically by executing a string, rather than being defined in the text of a file. You can see this if you do:
+exec('def foo(): return 1')
+print(inspect.getfile(foo))
+
+I'm not sure why sqlAlchemy needs to define exists() this way. But I don't think there's any way to get the source file that does it.
",python
+"Passing parameter to function while multithreadingI'm learning about multithreading and when I try to pass a parameter to my function in each thread it will process sequentially. Why is that?
+import time
+import threading
+
+start = time.perf_counter()
+
+def sleepy_duck(name):
+ print(name, "duck going to sleep 1 sec")
+ time.sleep(1)
+ print(name, "waking up")
+
+
+t1 = threading.Thread(target=sleepy_duck("Johny"))
+t2 = threading.Thread(target=sleepy_duck("Dicky"))
+t3 = threading.Thread(target=sleepy_duck("Loly"))
+
+t1.start()
+t2.start()
+t3.start()
+
+t1.join()
+t2.join()
+t3.join()
+
+finish = time.perf_counter()
+print("The ducks slept ", finish-start, " seconds.")
+
+Result:
+Johny duck going to sleep 1 sec
+Johny waking up
+Dicky duck going to sleep 1 sec
+Dicky waking up
+Loly duck going to sleep 1 sec
+Loly waking up
+The ducks slept 3.0227753 seconds.
+
","Try this and note the difference:
+import time
+import threading
+
+start = time.perf_counter()
+
+def sleepy_duck(name):
+ print(name, "duck going to sleep 1 sec")
+ time.sleep(1)
+ print(name, "waking up")
+
+
+t1 = threading.Thread(target=sleepy_duck, args=("Johnny",))
+t2 = threading.Thread(target=sleepy_duck, args=("Dicky",))
+t3 = threading.Thread(target=sleepy_duck, args=("Loly",))
+
+t1.start()
+t2.start()
+t3.start()
+
+t1.join()
+t2.join()
+t3.join()
+
+finish = time.perf_counter()
+print("The ducks slept ", finish-start, " seconds.")
+
+The way you did it originally the target was None
",python
+"How do we test a specific method written in a list of files for functional testing in pythonThe project has so many modules. There are functional test cases being written for almost every api written like for GET requests, POST requests and PUT requests. To test an individual file we use the syntact pytest tests/file_name.py
+but I want to test a specific method in that file. Is there any way to test it like that??
","Duplicate of Is there a way to specify which pytest tests to run from a file?
+In a few words, you can use the -k option of pytest to specify the name of the test you would like to run.
",python
+"how can i remove the black bg color of an image with no alpha layerSo i started doing a game to pass the time and i don't know how to solve this issue:
+my player is a part of a sprite sheet, the sprite sheet has a alpha layer so its transparent but when i divide my sprite sheet into small sprite, this alpha layer disapear and i have a black bg instead... I tried using set_colorkey([0, 0, 0]) to remove the black bg, but beceause my player is dark skinned, my player partially disappear. Any suggestion?
+import pygame
+
+
+class Player(pygame.sprite.Sprite):
+ def __init__(self):
+ super().__init__()
+ self.image = pygame.image.load("assets/img/plr.png")
+ self.image = self.get_image(0, 0)
+ self.image = pygame.transform.scale(self.image, (96, 96))
+ self.image.set_colorkey([0, 0, 0])
+ self.rect = self.image.get_rect(center=(250, 250))
+
+ def get_image(self, x, y):
+ image = pygame.Surface([48, 48])
+ image.blit(self.image, (0, 0), (x, y, 48, 48))
+ return image
+
+ def update(self):
+ pass
+
","The alpha channel "disappears" because you create a Surface without alpha channel (RGB). You have to use the SRCALPHA flag to create a Surface with an alpha channel (RGBA). Also see pygame.Surface:
+image = pygame.Surface([48, 48])
+image = pygame.Surface([48, 48], pygame.SRCALPHA)
+
+For a surface without alpha channels you can set the transparent color key with set_colorkey:
+image = pygame.Surface([48, 48])
+image.set_colorkey((0, 0, 0))
+
",python
+"How to extract each sheet within an Excel file into an individual csv file as it is without appending any column or row?
I'm trying to extract each sheet in an Excel file into multiple CSVs.
+For example Sheet1, Sheet2 and Sheet3 from Sample_1.xlsx into Sheet1.csv, Sheet2.csv and Sheet3.csv
+The code I run is as below :
+import pandas as pd
+
+dfs = pd.read_excel('Sample_1.xlsx', sheet_name=None)
+
+for sheet_name, data in dfs.items():
+ data.to_csv(f"{sheet_name}.csv")
+
+
+It outputs all of the three CSVs as desired but each csv has an extra column (column A) with index starting 0,1,2..n . Why is that happening? and How do I get rid of it?
","You need to specify the index parameter of pandas.DataFrame.to_csv as False
+Replace :
+data.to_csv(f"{sheet_name}.csv")
+
+By :
+data.to_csv(f"{sheet_name}.csv", index=False)
+
",python
+"Screen Manager set/get TextInput when button pressedAll I'm trying to do is get and/or set the input from the TextInput: inside the GridLayout, inside the SeccondWindow. (id: "text_input") But after googling, and googling... all my attempts have failed.
+I have tried
+main_screen = self.manager.get_screen('second')
+main_screen.ids.text_input.text = "Something..."
+
+Only to get the error "AttributeError: 'super' object has no attribute 'getattr'"
+I've tried
+text = self.root.ids["text_input"].text
+
+But I get the error "AttributeError: 'SecondWindow' object has no attribute 'root'"
+I've tried loads of things...
+I'M GOING INSANE!! (And I'm dumb so please help!)
+Here's my new_window.kv file
+WindowManager:
+ transition: NoTransition()
+ FirstWindow:
+ SecondWindow:
+
+
+<FirstWindow>:
+ name: "first"
+
+ BoxLayout:
+ orientation: "vertical"
+ size: root.width, root.height
+
+ Button:
+ text: "Go To Next Screen"
+ on_release:
+ app.root.current = "second"
+
+
+<SecondWindow>:
+ name: "second"
+ GridLayout:
+ id: "Container"
+ cols: 2
+ rows: 1
+
+ ScrollView:
+ id: "SideMenuScrollView"
+ size_hint: ("0.3dp", 1)
+ do_scroll_y: True
+ do_scroll_x: False
+
+ StackLayout:
+ id: "SideMenuStack"
+ size_hint_y: None
+ height: self.minimum_height
+
+ Button:
+ size_hint: (None, None)
+ size: ("92dp", "92dp")
+
+ Button:
+ size_hint: (None, None)
+ size: ("92dp", "92dp")
+
+
+ GridLayout:
+ size_hint: (1, 1)
+ id: "MyGrid"
+ size: (1, 1)
+ spacing: 10
+ padding: 10
+ cols: 1
+ rows: 2
+
+ TextInput:
+ id: "text_input"
+ multiline: False
+ text: ""
+ size_hint: (1, None)
+ height: "30dp"
+
+
+ Button:
+ text: "Do Stuff"
+ on_release: root.DoStuff()
+ size_hint: (1, None)
+ height: "70dp"
+
+Here's my code
+from kivy.app import App
+from kivy.uix.screenmana.........
+
+class WindowManager(ScreenManager):
+ pass
+
+
+class FirstWindow(Screen):
+ pass
+
+
+class SecondWindow(Screen):
+ def DoStuff(self):
+
+ #
+ # text_input.text = "what ever..."
+ #
+
+
+kv = Builder.load_file('new_window.kv')
+
+
+class AwesomeApp(App):
+ def build(self):
+ return kv
+
+
+if __name__ == "__main__":
+ AwesomeApp().run()
+
+
+Does anyone know how to get/set the TextInput .text value and DoStuff with it??
","The main problem is in your kv file:
+ TextInput:
+ id: "text_input"
+
+If you define an id with enclosing ", those " become part of the id. You can do that, but it complicates accessing that id. A simpler approach is to just eliminate the ", like this:
+ TextInput:
+ id: text_input
+
+Then you can access the TextInput as:
+class SecondWindow(Screen):
+ def DoStuff(self):
+ self.ids.text_input.text = 'Abba'
+
",python
+"Group by, summarise and divide by the number of distinct months using PythonAssume I have a following data:
+id date value
+1 2020-01-22 20
+1 2020-03-12 18
+1 2020-03-25 16
+2 2020-04-22 20
+2 2020-04-22 23
+
+First I wish group by id and date and sum values for distinct dates. Then, I want to group by id to sum the total value and divide by the count of distinct months from date.
+The first part is easy. I can simply do: df.groupby(["id", "date"]).sum(). I then get the following:
+ value
+id date
+1 2020-01-22 20
+1 2020-03-12 18
+1 2020-03-25 16
+2 2020-04-22 43
+
+However, I do not just want the aggregate; I want the sum divided by the number of unique months in date. My idea for counting the unique months would be: len(pd.to_datetime(df["date"]).dt.to_period('M').unique()). However, I have no idea how to combine the two.
+Basically, the output I'm looking for is:
+id value_after_division
+1 27
+2 43
+
+In simpler terms: 27=(20+18+16)/2 and 43=(43)/1.
","You'd like to aggregate the "number of unique months" and the "total value" per id, and then divide one by the other. You already have the latter. For the former, it helps to add a (temporary) column holding the month:
+# get a groupby object after making the year-month period available
+# (to_datetime is needed if date is stored as strings; to_period keeps months of different years distinct)
+g = df.assign(month=pd.to_datetime(df["date"]).dt.to_period("M")).groupby("id")
+
+# aggregate
+nuniq_mon = g["month"].nunique()
+total_val = g["value"].sum()
+
+# div is the method form of /
+result = total_val.div(nuniq_mon)
+
+to get
+>>> result
+
+id
+1 27.0
+2 43.0
+dtype: float64
+
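As a self-contained sketch of the same idea, assuming date is stored as strings (hence the explicit pd.to_datetime) and using year-month periods so that the same month in different years is counted separately. The sample frame simply reproduces the data from the question, and the name value_after_division is taken from the desired output there:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 1, 2, 2],
    'date': ['2020-01-22', '2020-03-12', '2020-03-25', '2020-04-22', '2020-04-22'],
    'value': [20, 18, 16, 20, 23],
})

# Temporary year-month column, then one groupby object for both aggregates
g = df.assign(month=pd.to_datetime(df['date']).dt.to_period('M')).groupby('id')

# Total value per id divided by the number of distinct months per id
result = g['value'].sum().div(g['month'].nunique()).rename('value_after_division')

print(result)
```

Because both aggregates come from the same groupby object, they share the id index and the division aligns row by row.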
",python
+"Why is text inside HTML tags getting translated when requested while Web Scraping?
+I am learning a little bit about web scraping and am currently working on a small project. With this code I store the HTML in the soup variable.
+source=requests.get(URL)
+soup=BeautifulSoup(source.text,'html.parser')
+
+The problem is: when I inspect the code inside my browser it looks like this:
+<a ...>The Godfather</a>
+
+but when I fetch it in my program, the text inside the tag (The Godfather) is translated to my native language (Кум):
+<a ...>Кум</a>
+
+I don't want it to be translated.
+My browser is set entirely to English, and I have no idea why this is happening. Any help would be much appreciated!
","Try specifying the Accept-Language HTTP header in your request:
+import requests
+from bs4 import BeautifulSoup
+
+
+url = "https://www.imdb.com/search/title/?groups=top_100&sort=user_rating,desc"
+
+headers = {"Accept-Language": "en-US,en;q=0.5"}
+soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
+
+
+for h3 in soup.select("h3"):
+    print(h3.get_text(strip=True, separator=" "))
+
+Prints:
+1. The Shawshank Redemption (1994)
+2. The Godfather (1972)
+3. The Dark Knight (2008)
+4. The Lord of the Rings: The Return of the King (2003)
+5. Schindler's List (1993)
+6. The Godfather Part II (1974)
+
+...
+
",python
+"Python pandas df: if col_A contains a string from a list, append the string to col_B
+I have a list of about 60 words and a data frame of over 7000 sentences. I would like to add a column that, for each line, holds the words from the list that the line contains (some lines contain several words from the list, and some repeat words). I've tried a bunch of ways to no avail. Starting off with
+list=['a', 'b', 'c', ...]
+
+ for x in list:
+ if df["col_A].str.contains(x):
+ df[col_B].append(x)
+
+but it returns an error
+
+ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
+
+Any help on how to fix this, or other ways to approach it, would be much appreciated. I'm new to Stack Overflow and have only been coding for a few months, if that helps gauge where I am in terms of knowledge.
+Edit: I'd like the new column to contain the words from the list that appear in the string from the existing column.
+Also, if it helps: it's a movie script I'm trying to analyze, where each row is a line from the script and who said it. I want to analyze how many times a word is said and who had favorite words.
","Try this:
+list_words=["a",....]
+def add_list_words(sentences):
+    returned_list=[]
+    for word in list_words:
+        if word in sentences:  # substring test on the sentence string
+            returned_list.append(word)
+    return list(set(returned_list))  # set() drops duplicates
+df["col_B"]=df["col_A"].apply(add_list_words)
+
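Note that "word in sentences" is a substring test, so a short word also matches inside longer words. If whole-word matches are wanted, one possible variant is sketched below. It swaps in a regular expression with word boundaries; the word list, the sample sentences and the helper name find_words are made up for illustration:

```python
import re
import pandas as pd

list_words = ['cat', 'dog', 'bird']  # hypothetical word list
df = pd.DataFrame({'col_A': ['The cat saw a dog and a cat',
                             'No pets here',
                             'Catalog of birds']})

# One alternation pattern; \b keeps 'cat' from matching inside 'Catalog'
pattern = re.compile(r'\b(' + '|'.join(map(re.escape, list_words)) + r')\b',
                     flags=re.IGNORECASE)

def find_words(sentence):
    # Lower-case the hits, then drop repeats while keeping first-seen order
    found = [m.lower() for m in pattern.findall(sentence)]
    return list(dict.fromkeys(found))

df['col_B'] = df['col_A'].apply(find_words)
print(df)
```

Using dict.fromkeys instead of set keeps the words in the order they were spoken, which may matter when analyzing a script line by line.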
",python
+"Python how to add two matrices with different number of rows and columns?
+Suppose I have two 2D matrices: [[1,2],[0,0]] and [[4,8],[0,0],[5,6]] and I want to sum them up and get the matrix [[5,10],[0,0],[5,6]]. How can I do it in the easiest way without writing my own function? As I understand it, numpy matrices don't support this.
+import numpy as np
+m1 = np.array([[1, 2], [0, 0]])
+m2 = np.array([[4, 8], [0, 0], [5, 6]])
+res = m1+m2
+
","import numpy as np
+
+m1 = np.array([[1, 2], [0, 0]])
+m2 = np.array([[4, 8], [0, 0], [5, 6]])
+
+# Grow the smaller array in place to the larger shape; ndarray.resize fills the new entries with zeros
+if m1.shape < m2.shape:
+    m1.resize(m2.shape)
+else:
+    m2.resize(m1.shape)
+
+answer = m1 + m2
+print(answer)
+
+Output:
+[[ 5 10]
+ [ 0 0]
+ [ 5 6]]
+
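An alternative sketch that leaves both inputs untouched: allocate a zero array with the larger number of rows and columns, then add each matrix into its top-left corner. This is a different technique from the in-place resize above, shown here under the assumption that missing entries should count as zero:

```python
import numpy as np

m1 = np.array([[1, 2], [0, 0]])
m2 = np.array([[4, 8], [0, 0], [5, 6]])

# Target shape: the larger extent along each axis
rows = max(m1.shape[0], m2.shape[0])
cols = max(m1.shape[1], m2.shape[1])

result = np.zeros((rows, cols), dtype=np.result_type(m1, m2))
for m in (m1, m2):
    # Add each matrix into the top-left block that it covers
    result[:m.shape[0], :m.shape[1]] += m

print(result)
```

This also handles the case where one matrix has more rows and the other more columns, which a single resize of the "smaller" array does not cover.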
",python
+"Beginner in Python: Trying to check for identical items in multiple lists
+I'm a beginner trying to create a small Python project to help me find out which ingredient I might be allergic to from a group of four products. Ideally, I would like to input the ingredients of each product into a list format, and then return identical ingredients which appear in two or more of the products.
+I'd like to input something like:
+product_1 = ["polybutene", "cranberry seed oil", "beeswax"]
+product_2 = ["vegetable oil", "beeswax", "shea butter"]
+
+and get a result like:
+list1 = ["beeswax"]
+
+I've tried to use compare_intersect but haven't made any progress. Thank you very much in advance!
","If you want a list out you can do
+list(set(product_1).intersection(set(product_2)))
+
+Factored out to make each operation clear this looks like
+set_1 = set(product_1)
+set_2 = set(product_2)
+
+intersection = set_1.intersection(set_2)
+
+result = list(intersection)
+
+If you want to preserve duplicates you can implement this efficiently like so.
+from collections import Counter
+def list_intersection(l1, l2):
+    c2 = Counter(l2)
+    result = []
+    for element in l1:
+        if c2[element] > 0:  # There are still more of those elements in the other list
+            # The element is shared between the lists
+            result.append(element)
+            c2[element] -= 1
+    return result
+
+This will preserve the order of the first list, while only keeping the number of elements of the same type shared between the lists. Here's an example
+list_intersection(['beeswax', 'test', 'other', 'beeswax', 'beeswax'],
+ ['frank', 'green', 'test', 'beeswax', 'beeswax'])
+
+
+prints: ['beeswax', 'test', 'beeswax']
+
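The question mentions four products and ingredients that appear in two or more of them. A minimal sketch for that case, assuming the products are kept in a dict; the third and fourth ingredient lists are invented for illustration:

```python
from collections import Counter

products = {
    'product_1': ['polybutene', 'cranberry seed oil', 'beeswax'],
    'product_2': ['vegetable oil', 'beeswax', 'shea butter'],
    'product_3': ['shea butter', 'lanolin'],   # made-up example data
    'product_4': ['beeswax', 'castor oil'],    # made-up example data
}

# Count in how many products each ingredient occurs;
# set() makes sure an ingredient counts once per product
counts = Counter(ing for ings in products.values() for ing in set(ings))

shared = sorted(ing for ing, n in counts.items() if n >= 2)
print(shared)
```

Raising the threshold in the last comprehension (for example n >= 3) narrows the result to ingredients common to more products.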
",python
+"GroupBy results to list of dictionaries, Using the grouped by object in it
+My DataFrame looks like so:
+Date Column1 Column2
+1.1 A 1
+1.1 B 3
+1.1 C 4
+2.1 A 2
+2.1 B 3
+2.1 C 5
+3.1 A 1
+3.1 B 2
+3.1 C 2
+
+And I'm looking to group it by Date and extract that data to a list of dictionaries so it appears like this:
+[
+ {
+ "Date": "1.1",
+ "A": 1,
+ "B": 3,
+ "C": 4
+ },
+ {
+ "Date": "2.1",
+ "A": 2,
+ "B": 3,
+ "C": 5
+ },
+ {
+ "Date": "3.1",
+ "A": 1,
+ "B": 2,
+ "C": 2
+ }
+]
+
+This is my code so far:
+df.groupby('Date')['Column1', 'Column2'].apply(lambda g: {k, v for k, v in g.values}).to_list()
+
+With this method I can't use the grouped-by key (Date) inside the apply itself, so the result is:
+[
+ {
+
+ "A": 1,
+ "B": 3,
+ "C": 4
+ },
+ {
+ "A": 2,
+ "B": 3,
+ "C": 5
+ },
+ {
+ "A": 1,
+ "B": 2,
+ "C": 2
+ }
+]
+
+Using to_dict() gives me access to the grouped-by key, but not in the shape I need.
+Is anyone familiar with an elegant way to solve this?
+Thanks!!
","You could first reshape your data using df.pivot, reset the index, and then apply to_dict to the new shape with the orient parameter set to "records". So:
+import pandas as pd
+
+data = {'Date': ['1.1', '1.1', '1.1', '2.1', '2.1', '2.1', '3.1', '3.1', '3.1'],
+ 'Column1': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
+ 'Column2': [1, 3, 4, 2, 3, 5, 1, 2, 2]}
+
+df = pd.DataFrame(data)
+
+df_pivot = df.pivot(index='Date',columns='Column1',values='Column2')\
+ .reset_index(drop=False)
+result = df_pivot.to_dict('records')
+
+target = [{'Date': '1.1', 'A': 1, 'B': 3, 'C': 4},
+ {'Date': '2.1', 'A': 2, 'B': 3, 'C': 5},
+ {'Date': '3.1', 'A': 1, 'B': 2, 'C': 2}]
+
+print(result == target)
+# True
+
",python
+"How to add/remove only specific blocks of XML to a new file?
+I'm trying to filter an XML file so that only specific blocks are kept. The original XML looks like this
+<PROJECT>
+ <BLOCKLIST>
+ <BLOCK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 1" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 2" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 3" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 4" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ </BLOCK>
+ </BLOCKLIST>
+</PROJECT>
+
+Now I need to check <INSTALL_METHOD installer="x" /> and copy the matching TASK blocks to a new file. For example, if I want only TYPE 1 and TYPE 3, new.xml should look something like this
+<PROJECT>
+<BLOCKLIST>
+ <BLOCK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 1" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ <TASK>
+ <INSTALL_METHOD installer="TYPE 3" />
+ <FILE>
+ <INSTALL_OPTIONS option="signature"/>
+ <INSTALL_OPTIONS option="checksum"/>
+ </FILE>
+ </TASK>
+ </BLOCK>
+</BLOCKLIST>
+</PROJECT>
+
+
+I tried the below approach, but I'm getting the following error ValueError: list.remove(x): x not in list.
+import xml.etree.ElementTree as ET
+
+tree = ET.parse("input.xml")
+root = tree.getroot()
+tasks = root.findall(".//BLOCKLIST/BLOCK/TASK")
+
+for task in tasks:
+ install_method = task.find("INSTALL_METHOD")
+ if not install_method.get("installer") in ["TYPE 1" , "TYPE 3"]:
+ root.remove(task)
+
+tree.write("new.xml", encoding='UTF-8', xml_declaration=True)
+
","The task is not a direct child of root, so you can't remove it like this. If you use lxml.etree instead of xml.etree.ElementTree you can get the task parent and use it to remove the task (the rest of the code is without change)
+import lxml.etree as ET
+
+...
+if install_method.get("installer") not in ["TYPE 1", "TYPE 3"]:
+    task.getparent().remove(task)
+...
+
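If installing lxml is not an option, a rough standard-library sketch is to iterate over the parent BLOCK elements and remove the unwanted TASK children from them directly. The XML below is a shortened version of the question's input (FILE children omitted), and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

xml_text = '''<PROJECT><BLOCKLIST><BLOCK>
  <TASK><INSTALL_METHOD installer="TYPE 1" /></TASK>
  <TASK><INSTALL_METHOD installer="TYPE 2" /></TASK>
  <TASK><INSTALL_METHOD installer="TYPE 3" /></TASK>
  <TASK><INSTALL_METHOD installer="TYPE 4" /></TASK>
</BLOCK></BLOCKLIST></PROJECT>'''

root = ET.fromstring(xml_text)
keep = {'TYPE 1', 'TYPE 3'}

# Walk the parents (BLOCK) so each TASK is removed from its real parent
for block in root.iter('BLOCK'):
    for task in list(block.findall('TASK')):  # list(): block is mutated while looping
        method = task.find('INSTALL_METHOD')
        if method is None or method.get('installer') not in keep:
            block.remove(task)

remaining = [t.find('INSTALL_METHOD').get('installer') for t in root.iter('TASK')]
print(remaining)
```

With a file on disk, ET.parse("input.xml") followed by tree.write("new.xml", ...) as in the question would replace the fromstring call.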
",python
+"Hyperscience DatabaseAccessBlock - Test Connection Failing
+I’m leveraging a DatabaseAccessBlock on the Hyperscience application (v34.0.3) to communicate with an external database. I’m not able to see the query in the UI and the test connection fails. I'm not finding much information regarding debugging database queries in the SDK; how do I debug this?
+Here is my code snippet:
+def build_query(submission: Any) -> Any:
+    table_name = 'table'
+    id = '1'
+    query = 'SELECT * FROM ' + table_name + ' WHERE 1 = 1 AND id = ' + id
+    return query
+
+cct_build_query = CodeBlock(
+    reference_name='build_query',
+    code=build_query,
+    code_input={'submission': load_submission.output()},
+    title='Build Query',
+    description='Build Query',
+)
+
+db_lookup = DatabaseAccessBlock(
+    reference_name='db_lookup',
+    title='Database Lookup',
+    description='Perform database lookup',
+    db_type='postgres',
+    database='<database_name>',
+    host='<hostname>',
+    username='<username>',
+    password='<password>',
+    port = 5432,
+    timeout = 200,
+    query= cct_build_query.output(),
+)
+
","The Hyperscience UI does not show the query field if the query is dynamically drawing values from a previous block, but only when a static query has been defined within the DatabaseAccessBlock itself.
+To test a connection, set a static query value, test the connection, and then revert to the dynamically set value. It seems that while a query is required to test the connection, the query does not have to be valid - the system simply checks if there is a value present in the query field to go on to test the connection.
+You can change the db block to the following, for example:
+db_lookup = DatabaseAccessBlock(
+    reference_name='db_lookup',
+    title='Database Lookup',
+    description='Perform database lookup',
+    db_type='postgres',
+    database='<database_name>',
+    host='<hostname>',
+    username='<username>',
+    password='<password>',
+    port = 5432,
+    timeout = 200,
+    query='select * from random_table',  # static placeholder query, only needed to test the connection
+)
+
",python
+"Why does pandas.DataFrame change the data source?
+I'm learning Python, and I found something I can't understand.
+I created a pandas.DataFrame from an ndarray, and then modified only the DataFrame, not the ndarray.
+To my surprise, the ndarray changed too!
+Is the data cached inside the DataFrame?
+If yes, why did the values inside the ndarray change?
+If no, what about a DataFrame created without any source?
+from pandas import DataFrame
+import numpy as np
+
+if __name__ == '__main__':
+    nda1 = np.zeros((3,3), dtype=float)
+    print(f'original nda1:\n{nda1}\n')
+
+    df1 = DataFrame(nda1)
+    print(f'original df1:\n{df1}\n')
+
+    df1.iat[2,2] = 999
+    #print(f'df1 in main:\n{df}\n')
+    print(f'nda1 after modify:\n{nda1}\n')
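+
","DataFrames use NumPy arrays under the hood. Because your array has a single homogeneous dtype, the DataFrame can wrap it as is instead of copying it, so both objects share the same memory. (Depending on the pandas version and its copy-on-write setting, the constructor may copy the array by default instead.)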
+You can check it with:
+pd.DataFrame(nda1).values.base is nda1
+# True
+
+You can force a copy to avoid the issue:
+df1 = pd.DataFrame(nda1.copy())
+
+or copy from within the constructor:
+df1 = pd.DataFrame(nda1, copy=True)
+
+check that the underlying array is different:
+pd.DataFrame(nda1, copy=True).values.base is nda1
+# False
+
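A small runnable sketch of the copy=True route, with np.shares_memory as an additional way to check for aliasing (it compares the underlying buffers rather than relying on the .base attribute). The variable name shares is only for illustration:

```python
import numpy as np
import pandas as pd

nda1 = np.zeros((3, 3), dtype=float)

# Explicit copy: the DataFrame owns its data from the start
df1 = pd.DataFrame(nda1, copy=True)
df1.iat[2, 2] = 999

# Do the frame's values and the source array overlap in memory?
shares = np.shares_memory(df1.to_numpy(), nda1)

print(shares)
print(nda1[2, 2])  # value in the source array after editing the frame
```

With the explicit copy, editing the frame leaves the source array unchanged, independent of how a given pandas version treats the default case.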
",python
+"Why the horizontal scroll bar does not show up in PyQt6 QTextEdit widget
+In the example below I am using PyQt6 to create text with long lines that exceed the QTextEdit widget's viewing size vertically and horizontally. The vertical scroll bar shows up; however, the horizontal scroll bar does not. Any help with this issue is appreciated.
+import sys
+from PyQt6.QtWidgets import (QApplication, QMainWindow, QPushButton,
+ QWidget, QHBoxLayout, QVBoxLayout, QTextEdit)
+from PyQt6.QtCore import QSize
+
+class MainWindow(QMainWindow):
+ def __init__(self):
+ super().__init__()
+ self.setWindowTitle("Example")
+ self.setContentsMargins(20,20,20,20)
+ self.setFixedSize(QSize(800, 600))
+ self.setWindowTitle("Example")
+ layout1 = QHBoxLayout()
+ layout2 = QVBoxLayout()
+ self.output_text = QTextEdit()
+ self.button_start = QPushButton("Start")
+ self.button_start.clicked.connect(self.start)
+ layout1.addLayout(layout2)
+ layout2.addWidget(self.output_text)
+ layout2.addWidget(self.button_start)
+ widget = QWidget()
+ widget.setLayout(layout1)
+ self.setCentralWidget(widget)
+
+ def start(self):
+ for i in range (1, 30):
+ self.output_text.append("Line " + str(i) + ": This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. ")
+
+app = QApplication(sys.argv)
+window = MainWindow()
+window.show()
+app.exec()
+
","If you want a horizontal scrollbar, you need to set the QTextEdit's line-wrap width to something larger than its visible area. For example, 1000 pixels works for your code.
+import sys
+from PyQt6.QtWidgets import (QApplication, QMainWindow, QPushButton,
+ QWidget, QHBoxLayout, QVBoxLayout, QTextEdit)
+from PyQt6.QtCore import QSize, Qt
+
+class MainWindow(QMainWindow):
+    def __init__(self):
+        super().__init__()
+        self.setWindowTitle("Example")
+        self.setContentsMargins(20,20,20,20)
+        self.setFixedSize(QSize(800, 600))
+        self.setWindowTitle("Example")
+        layout1 = QHBoxLayout()
+        layout2 = QVBoxLayout()
+        self.output_text = QTextEdit()
+        self.output_text.setLineWrapColumnOrWidth(1000) #Here you set the width you want
+        self.output_text.setLineWrapMode(QTextEdit.LineWrapMode.FixedPixelWidth)
+        self.button_start = QPushButton("Start")
+        self.button_start.clicked.connect(self.start)
+        layout1.addLayout(layout2)
+        layout2.addWidget(self.output_text)
+        layout2.addWidget(self.button_start)
+        widget = QWidget()
+        widget.setLayout(layout1)
+        self.setCentralWidget(widget)
+
+    def start(self):
+        for i in range (1, 30):
+            self.output_text.append("Line " + str(i) + ": This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. This is a long line. ")
+
+app = QApplication(sys.argv)
+window = MainWindow()
+window.show()
+app.exec()
+
+Tell me if this wasn't what you were looking for or if it doesn't work for you.
",python
+"Convert 16 bit hex value to FP16 in Python?
+I'm trying to write a basic FP16-based calculator in Python to help me debug some hardware. I can't seem to find how to convert 16-bit hex values into floating point values I can use in my code to do the math. I see lots of online references to numpy but I think the float16 constructor expects a string like float16("1.2345"). I guess what I'm looking for is something like float16("0xabcd").
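+Thanks!
","numpy.float16 is indeed the IEEE 754 half-precision format: 1 sign bit, a 5-bit exponent and a 10-bit mantissa.
+To get the result of your example, pass the bytes in big-endian order and tell NumPy so with the '>f2' dtype (the native float16 dtype is little-endian on most machines and would read these two bytes as 0xcdab instead):
+import numpy as np
+
+np.frombuffer(b'\xab\xcd', dtype='>f2', count=1)
+
+Result (0xabcd is about -0.0609):
+array([-0.06094], dtype='>f2')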
+
+Or, to show how you can encode and decode the other example 1.2345:
+import numpy as np
+
+a = np.array([1.2345], np.float16)
+b = a.tobytes()
+print(b)
+c = np.frombuffer(b, dtype=np.float16, count=1)
+print(c)
+
+Result:
+b'\xf0<'
+[1.234]
+
+If you literally needed to turn the string you provided into an FP16:
+import numpy as np
+
+s = "0xabcd"
+b = int(s, base=16).to_bytes(2, 'big')
+print(b)
+c = np.frombuffer(b, dtype='>f2', count=1)  # '>f2' is big-endian float16, matching the 'big' byte order above
+print(c)
+
+Output:
+b'\xab\xcd'
+[-0.06094]
+
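For a calculator that only needs scalar conversions, a NumPy-free sketch is also possible with the standard library's struct module, whose 'e' format code is IEEE 754 half precision. The helper names hex_to_fp16 and fp16_to_hex are made up for this example, and the hex string is assumed to be most-significant byte first:

```python
import struct

def hex_to_fp16(hex_str):
    # '0xabcd' -> two bytes, most significant first -> half-precision value
    raw = int(hex_str, 16).to_bytes(2, 'big')
    return struct.unpack('>e', raw)[0]

def fp16_to_hex(value):
    # Round a Python float to half precision and show its 16-bit pattern
    return '0x' + struct.pack('>e', value).hex()

print(hex_to_fp16('0x3c00'))
print(hex_to_fp16('0xabcd'))
print(fp16_to_hex(1.0))
```

The returned values are ordinary Python floats, so arithmetic on them happens in double precision; round-tripping through fp16_to_hex after each operation is one way to mimic the hardware's rounding.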
",python
+"Discord.py: Is it possible to use one cog's function in another cog?
+Hello, I just want to use the async method from the first cog in another one, so that the code won't be too long for a single file and to maximize code reuse:
+Module_cogs_1.py :
+class Music_base(commands.Cog):
+    def __init__(self, bot):
+        self.bot = bot
+
+    # this is the module that to be imported
+
+    async def insert_player_song(self, ctx, track, track_index):
+        player = self.bot.lavalink.player_manager.get(ctx.guild.id)
+        player.add(requester=ctx.author.id, track=track, index=track_index)
+
+
+Module_cogs_2.py :
+class Module_test(commands.Cog):
+    def __init__(self, bot):
+        self.bot = bot
+    @commands.command(aliases=['test'])
+    async def test_track(self, ctx, track_index :typing.Optional[int]=0, *, query:str):
+        # accessing the async module from Module_cogs_1.py
+        await Music_base.insert_player_song(self, ctx, track, track_index)
+
+
+If the method is defined locally I can call it as await self.insert_player_song(), but when I try to call it from the second file without self in the arguments, await Music_base.insert_player_song(ctx, track, track_index), I get:
+Traceback (most recent call last):
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/bot.py", line 939, in invoke
+ await ctx.command.invoke(ctx)
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/core.py", line 863, in invoke
+ await injected(*ctx.args, **ctx.kwargs)
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/core.py", line 94, in wrapped
+ raise CommandInvokeError(exc) from exc
+discord.ext.commands.errors.CommandInvokeError: Command raised an exception: TypeError: insert_player_song() missing 1 required positional argument: 'track_index'
+
+
+But when I add self to the arguments, await Music_base.insert_player_song(self, ctx, track, track_index), I get:
+The above exception was the direct cause of the following exception:
+
+Traceback (most recent call last):
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/bot.py", line 939, in invoke
+ await ctx.command.invoke(ctx)
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/core.py", line 863, in invoke
+ await injected(*ctx.args, **ctx.kwargs)
+ File "/home/vee/sauce-v3x/venv/lib/python3.9/site-packages/discord/ext/commands/core.py", line 94, in wrapped
+ raise CommandInvokeError(exc) from exc
+discord.ext.commands.errors.CommandInvokeError: Command raised an exception: AttributeError: 'Music_base' object has no attribute 'guild_queues'
+
+I hope this is understandable. Thank you so much in advance.
","Non-static methods can't be called without an instance of the class.
+You should be able to get the created instance of the class using bot.get_cog passing the name of the Cog.
+Then you can use that instance to call the method.
+music_base = self.bot.get_cog("Music_base")
+await music_base.insert_player_song(ctx, track, track_index)
+
+If it is a command you want to run, you can also get the command with bot.get_command, store it in a variable, and execute it as a function.
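The difference between calling through the class and calling through the registered instance can be sketched without discord.py at all. In the toy example below, FakeBot is a hypothetical stand-in that only mimics add_cog and get_cog, and the cog classes are simplified versions of the ones in the question:

```python
import asyncio

class FakeBot:
    # Hypothetical stand-in for the bot: keeps one instance per cog name
    def __init__(self):
        self._cogs = {}

    def add_cog(self, cog):
        self._cogs[type(cog).__name__] = cog

    def get_cog(self, name):
        return self._cogs.get(name)

class MusicBase:
    def __init__(self, bot):
        self.bot = bot
        self.queue = []  # state that lives on the instance

    async def insert_player_song(self, track, track_index):
        self.queue.insert(track_index, track)

class ModuleTest:
    def __init__(self, bot):
        self.bot = bot

    async def test_track(self, track, track_index=0):
        # Look up the registered instance, then await its coroutine method
        music_base = self.bot.get_cog('MusicBase')
        await music_base.insert_player_song(track, track_index)

bot = FakeBot()
bot.add_cog(MusicBase(bot))
bot.add_cog(ModuleTest(bot))

asyncio.run(bot.get_cog('ModuleTest').test_track('song.mp3'))
queue = bot.get_cog('MusicBase').queue
print(queue)
```

Passing the second cog as self to a method of the first, as in the question, fails for the same reason this sketch would: the attributes the method expects only exist on the real instance.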
",python
+"ModuleNotFoundError: No module named 'xxx' with my custom type in Visual Studio code
+I try to run the code below by clicking Run Python File (marked with a red box in my screenshot):
+car_client.py
+from MyLib.car import Car
+
+car = Car()
+print(car.get_name())
+
+But I get the error below:
+
+from MyLib.car import Car
+
+ModuleNotFoundError: No module named 'MyLib'
+
+car.py
+class Car:
+    def get_name(self):
+        return 'BMW'
+
","Add the following to the top of the code in the car_client.py file:
+import sys
+sys.path.append("./MyLib")
+
+and modify the import as
+from car import Car
+
+The code will then run.
",python
+"Group data in pivot Pandas
+Hello, I made a table with pandas and then created a pivot with two index levels. I would like to group the data so that one index level becomes a header row for the other. Below I show the tables I have and the result I would like.
+Table:
+| Name | Lang | Skill | Corp |
+|---|---|---|---|
+| Michael | java | 2 | Google |
+| Piter | C++ | 3 | Facebook |
+| Cristiano | python | 5 | Google |
+| Michael | java | 1 | Facebook |
+| Piter | C++ | 2 | Google |
+| Cristiano | python | 3 | Facebook |
+| Michael | java | 4 | Google |
+| Piter | C++ | 5 | Facebook |
+| Cristiano | python | 1 | Google |
+| Michael | python | 2 | Facebook |
+
+I used:
+pivot = pd.pivot_table(df, values="Skill", index=["Corp", "Name"], columns = "Lang", aggfunc="sum")
+
+
+and I have pivot:
+| Corp | Name | C++ | java | python |
+|---|---|---|---|---|
+| Facebook | Cristiano | nan | nan | 3 |
+| Facebook | Michael | nan | 1 | 2 |
+| Facebook | Piter | 8 | nan | nan |
+| Google | Cristiano | nan | nan | 6 |
+| Google | Michael | nan | 6 | nan |
+| Google | Piter | 2 | nan | nan |
+
+the result I would like:
+| Name | C++ | java | python |
+|---|---|---|---|
+| Facebook | sum_fb | sum_fb | sum_fb |
+| Cristiano | nan | nan | 3 |
+| Michael | nan | 1 | 2 |
+| Piter | 8 | nan | nan |
+| Google | sum_google | sum_google | sum_google |
+| Cristiano | nan | nan | 6 |
+| Michael | nan | 6 | nan |
+| Piter | 2 | nan | nan |
+
+Thank you in advance
","You can first aggregate the sums per Corp (the first index level) with GroupBy.sum, then append a second index level holding the same values so the result has a MultiIndex like pivot:
+df1 = (pivot.groupby(level=0).sum()
+            .assign(Name = lambda x: x.index)
+            .set_index('Name', append=True))
+
+Or group by level=[0,0] to get the MultiIndex directly; then you only need to set the level names with DataFrame.rename_axis:
+df1 = pivot.groupby(level=[0,0]).sum().rename_axis(['Corp','Name'])
+print (df1)
+Lang C++ java python
+Corp Name
+Facebook Facebook 8.0 1.0 5.0
+Google Google 2.0 6.0 6.0
+
+Then append pivot with concat. To get the right ordering, sort by the first level of the MultiIndex with DataFrame.sort_index; afterwards remove the first level (Corp) with DataFrame.droplevel, convert Name back to a column, and remove the columns name Lang with DataFrame.rename_axis:
+df = (pd.concat([df1, pivot])
+        .sort_index(level=0, sort_remaining=False)
+        .droplevel(0)
+        .reset_index()
+        .rename_axis(None, axis=1))
+print (df)
+ Name C++ java python
+0 Facebook 8.0 1.0 5.0
+1 Cristiano NaN NaN 3.0
+2 Michael NaN 1.0 2.0
+3 Piter 8.0 NaN NaN
+4 Google 2.0 6.0 6.0
+5 Cristiano NaN NaN 6.0
+6 Michael NaN 6.0 NaN
+7 Piter 2.0 NaN NaN
+
",python
+"Should I retrain the entire model with additional data?
+I trained a model with the default training data. However, when testing on examples from production, the testing accuracy is very low. Do I have to add new data from production? Is it necessary to train the entire model again? Should I use all new data or only some of it? I have tried using all the new data. This is what my code looks like:
+# In[]: Retrain the module(s) with the new data: Strategy 1: All old+new data
+model=load_model('model.h5')
+(x_train_old, y_train_old)=dataset.load_old_data()
+(x_train_new, y_train_new)=dataset.load_new_data()
+x_train=x_train_old+x_train_new
+y_train=y_train_old+y_train_new
+
+model.fit(x_train,y_train, epochs=nb_epoch,batch_size=128)
+
+
+# In[]: Retrain the module(s) with the new data: Strategy 2: Only new data
+model=load_model('model.h5')
+(x_train_new, y_train_new)=dataset.load_new_data()
+
+model.fit(x_train_new,y_train_new, epochs=nb_epoch,batch_size=128)
+
+
+Here, model.h5 is the already trained model.
","Yes, you should retrain the model with the new data. This is part of what is known as MLOps: you closely monitor the model's accuracy after it has gone into production, and if the accuracy decreases as new data arrives, you retrain the model. So do retrain the model on the new data.
",python
+"Django model auto update status when the medicine is expired
+How do I automatically update the status when a medicine has expired? For example, I create a medicine record and set expirationDate to July 2022; if the current date is later than expirationDate, the status of the medicine should be updated to "Expired". How do I do that?
+class Medicine(models.Model):
+    medstatus = [
+        ("Expired", "Expired"),
+        ("Active", "Active")
+    ]
+    description = models.CharField(max_length=50, null=True)
+    expirationDate = models.DateField(null=True)
+    status = models.CharField(max_length=50, choices=medstatus, null=True)
+
","Create a crontab entry that runs once a day:
+# example: every day at 00:00, send a GET request to your API
+0 0 * * * /usr/bin/curl --silent http://example.com/your_views_url/ &>/dev/null
+
+This is a good, easy-to-understand cron schedule generator:
+https://crontab.guru/#0_0_*_*_*
+The cron job triggers a view, which looks up the expired medicines and changes their status:
+urls.py:
+path("your_views_url/", views.your_view_name_here),
+
+views.py:
+from datetime import date
+
+from django.http import HttpResponse
+
+from .models import Medicine  # assumes Medicine lives in this app's models.py
+
+def your_view_name_here(request):
+    today = date.today()
+
+    # queryset of all medicines whose status needs to change
+    all_target_medicine = Medicine.objects.filter(status="Active", expirationDate__lt=today)
+
+    for medicine in all_target_medicine:
+        medicine.status = "Expired"
+        medicine.save()
+    return HttpResponse("Ok")
+
",python
+"Aggregate points into a grid using Polars
+I have a points dataset in the following format (x, y, value). Is it possible to get an aggregated dataset using Polars native (maybe even lazy) code as much as possible?
+Basically I want to create a virtual grid and then sum all the points in the respective grid cells, x_step and y_step being the grid cell dimensions. The output would be a dataset with columns (cell_x, cell_y, agg_value), with cell_x and cell_y taking only values divisible by my predefined steps (representing the bottom-left coordinate of the grid cell), and agg_value being the sum of all point values inside the cell:
+(cell_x <= x and x < cell_x + x_step) and (cell_y <= y and y < cell_y + y_step).
+Currently I am iterating from start, incrementing my variable cell_x by x_step and the same for Y axis in a nested loop. Then I call sum on the filtered subset (cell) of points and I output one row to the output cell. It is rather slow in Python.
+Here is a visual example, all points have value of 1 for simplicity:
","Edit: As of Polars 0.14.1, we can use the // operator as floor division instead of using floor, so that the algorithm becomes:
+step_x = 2
+step_y = 2
+(
+    df.with_columns(
+        [
+            ((pl.col("x") // step_x) * step_x).alias("cell_x"),
+            ((pl.col("y") // step_y) * step_y).alias("cell_y"),
+        ]
+    )
+    .groupby(["cell_x", "cell_y"])
+    .agg(
+        [
+            pl.col("val").sum().alias("agg_value"),
+        ]
+    )
+    .sort(["cell_x", "cell_y"]).collect()
+)
+
+One easy (and performant) way to solve this is to use the floor function to calculate the grid coordinates. We can easily accomplish all of this in Lazy mode, and using only Polars Expressions. (And best of all, no slow nested for loops.)
+Let's start with your data. We'll put the DataFrame in Lazy mode.
+import polars as pl
+df = (
+    pl.DataFrame(
+        {
+            "x": [0.5, 2, 2.5, 5.5],
+            "y": [1.5, 2.5, 3.5, 3.5],
+            "val": [1, 1, 1, 1],
+        }
+    )
+).lazy()
+df.collect()
+
+shape: (4, 3)
+┌─────┬─────┬─────┐
+│ x ┆ y ┆ val │
+│ --- ┆ --- ┆ --- │
+│ f64 ┆ f64 ┆ i64 │
+╞═════╪═════╪═════╡
+│ 0.5 ┆ 1.5 ┆ 1 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
+│ 2.0 ┆ 2.5 ┆ 1 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
+│ 2.5 ┆ 3.5 ┆ 1 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
+│ 5.5 ┆ 3.5 ┆ 1 │
+└─────┴─────┴─────┘
+
+We can aggregate the values into grid cells as follows:
+step_x = 2
+step_y = 2
+(
+    df.with_columns(
+        [
+            ((pl.col("x") / step_x).floor() * step_x).alias("cell_x"),
+            ((pl.col("y") / step_y).floor() * step_y).alias("cell_y"),
+        ]
+    )
+    .groupby(["cell_x", "cell_y"])
+    .agg(
+        [
+            pl.col("val").sum().alias("agg_value"),
+        ]
+    )
+    .sort(["cell_x", "cell_y"])
+    .collect()
+)
+
+shape: (3, 3)
+┌────────┬────────┬───────────┐
+│ cell_x ┆ cell_y ┆ agg_value │
+│ --- ┆ --- ┆ --- │
+│ f64 ┆ f64 ┆ i64 │
+╞════════╪════════╪═══════════╡
+│ 0.0 ┆ 0.0 ┆ 1 │
+├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
+│ 2.0 ┆ 2.0 ┆ 2 │
+├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
+│ 4.0 ┆ 2.0 ┆ 1 │
+└────────┴────────┴───────────┘
+
+Since the calculation of the coordinates is in a with_columns context, they will run in parallel.
+Other Notes
+To trace how the algorithm maps each point to the lower-left coordinate of the grid that contains it, just comment out the groupby and agg methods.
+step_x = 2
+step_y = 2
+(
+    df.with_columns(
+        [
+            ((pl.col("x") / step_x).floor() * step_x).alias("cell_x"),
+            ((pl.col("y") / step_y).floor() * step_y).alias("cell_y"),
+        ]
+    )
+    .sort(["cell_x", "cell_y"])
+    .collect()
+)
+
+shape: (4, 5)
+┌─────┬─────┬─────┬────────┬────────┐
+│ x ┆ y ┆ val ┆ cell_x ┆ cell_y │
+│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
+│ f64 ┆ f64 ┆ i64 ┆ f64 ┆ f64 │
+╞═════╪═════╪═════╪════════╪════════╡
+│ 0.5 ┆ 1.5 ┆ 1 ┆ 0.0 ┆ 0.0 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 2.0 ┆ 2.5 ┆ 1 ┆ 2.0 ┆ 2.0 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 2.5 ┆ 3.5 ┆ 1 ┆ 2.0 ┆ 2.0 │
+├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 5.5 ┆ 3.5 ┆ 1 ┆ 4.0 ┆ 2.0 │
+└─────┴─────┴─────┴────────┴────────┘
+
+Also, I was careful to design the algorithm to work with negative grid coordinates (if you need that). In general, you have to be careful to distinguish between "truncating" and "casting" which converts -2.5 to -2, versus "floor" which converts -2.5 to -3. (We want "floor" in this case.)
+For example:
+import numpy as np
+
+rng = np.random.default_rng(1)
+nbr_rows = 10
+df = (
+    pl.DataFrame(
+        {
+            "x": rng.uniform(-10, 10, nbr_rows),
+            "y": rng.uniform(-10, 10, nbr_rows),
+            "val": rng.integers(1, 10, nbr_rows),
+        }
+    )
+    .with_columns([pl.col(["x", "y"]).round(1).keep_name()])
+    .sort(["x", "y"])
+    .lazy()
+)
+df.collect()
+
+shape: (10, 3)
+┌──────┬──────┬─────┐
+│ x ┆ y ┆ val │
+│ --- ┆ --- ┆ --- │
+│ f64 ┆ f64 ┆ i64 │
+╞══════╪══════╪═════╡
+│ -9.4 ┆ -4.8 ┆ 9 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ -7.1 ┆ -3.4 ┆ 1 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ -3.8 ┆ -3.9 ┆ 5 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ -1.8 ┆ -1.9 ┆ 9 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ -1.5 ┆ -0.9 ┆ 5 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ 0.2 ┆ 5.1 ┆ 1 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ 1.0 ┆ -5.9 ┆ 7 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ 6.6 ┆ -7.3 ┆ 2 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ 9.0 ┆ 0.8 ┆ 7 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┤
+│ 9.0 ┆ 5.8 ┆ 3 │
+└──────┴──────┴─────┘
+
+The algorithm maps each point to lower-left grid coordinates as follows:
+shape: (10, 5)
+┌──────┬──────┬─────┬────────┬────────┐
+│ x ┆ y ┆ val ┆ cell_x ┆ cell_y │
+│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
+│ f64 ┆ f64 ┆ i64 ┆ f64 ┆ f64 │
+╞══════╪══════╪═════╪════════╪════════╡
+│ -9.4 ┆ -4.8 ┆ 9 ┆ -10.0 ┆ -6.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ -7.1 ┆ -3.4 ┆ 1 ┆ -8.0 ┆ -4.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ -3.8 ┆ -3.9 ┆ 5 ┆ -4.0 ┆ -4.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ -1.8 ┆ -1.9 ┆ 9 ┆ -2.0 ┆ -2.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ -1.5 ┆ -0.9 ┆ 5 ┆ -2.0 ┆ -2.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 1.0 ┆ -5.9 ┆ 7 ┆ 0.0 ┆ -6.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 0.2 ┆ 5.1 ┆ 1 ┆ 0.0 ┆ 4.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 6.6 ┆ -7.3 ┆ 2 ┆ 6.0 ┆ -8.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 9.0 ┆ 0.8 ┆ 7 ┆ 8.0 ┆ 0.0 │
+├╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
+│ 9.0 ┆ 5.8 ┆ 3 ┆ 8.0 ┆ 4.0 │
+└──────┴──────┴─────┴────────┴────────┘
+
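The difference between floor and truncation can also be checked without Polars. The following is only a minimal sketch in plain Python; the helper names cell_origin and cell_origin_truncated are made up for this illustration and are not part of the answer's code.

```python
import math

def cell_origin(value, step):
    # Floor-based mapping: always moves toward negative infinity,
    # so the result is the lower edge of the cell containing value.
    return math.floor(value / step) * step

def cell_origin_truncated(value, step):
    # Truncation-based mapping: int() moves toward zero,
    # which picks the wrong cell for negative coordinates.
    return int(value / step) * step

print(cell_origin(-9.4, 2))            # -10
print(cell_origin_truncated(-9.4, 2))  # -8
print(cell_origin(0.5, 2))             # 0
```

For positive coordinates both helpers agree; they only diverge below zero, which is why the floor variant is the one used for the grid.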
+
",python
+"How to save a DataFrame to excel with caption in PythonI'm trying to set a caption for my DataFrame and export it to excel but when I check the excel workbook the caption is not reflected.
+
+
+
import pandas as pd
+
+data={'Name':['Karan','Rohit','Sahil','Aryan'],'Age':[23,22,21,24]}
+
+df=pd.DataFrame(data)
+
+df.style.set_caption(""This is caption"").hide(axis='index').to_excel(""output.xlsx"", engine='xlsxwriter')
+
+
+
+Can anyone please help me to export the DataFrame with caption to xlsx?
+Thanks in advance!
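","If you want to write the caption on the first row, you can use:
+# get_worksheet_by_name() and write() are XlsxWriter methods, so request that engine explicitly
+with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
+    df.to_excel(writer, sheet_name='Data', index=False, startrow=1)
+    ws = writer.book.get_worksheet_by_name('Data')
+    ws.write('A1', 'This is a caption')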
+
",python
+"zipfile.BadZipFile: File is not a zip file when using ""openpyxl"" engineI have created a script which dumps the excel sheets stored in S3 into my local postgres database. I've used pandas read_excel and ExcelFile method to read the excel sheets.
+Code for the same can be found here.
+import boto3
+import pandas as pd
+import io
+import os
+from sqlalchemy import create_engine
+import xlrd
+
+os.environ["AWS_ACCESS_KEY_ID"] = "xxxxxxxxxxxx"
+os.environ["AWS_SECRET_ACCESS_KEY"] = "xxxxxxxxxxxxxxxxxx"
+s3 = boto3.client('s3')
+
+obj = s3.get_object(Bucket='bucket-name', Key='file.xlsx')
+data = pd.ExcelFile(io.BytesIO(obj['Body'].read()))
+print(data.sheet_names)
+a = len(data.sheet_names)
+
+engine1 = create_engine('postgresql://postgres:postgres@localhost:5432/postgres')
+for i in range(a):
+ df = pd.read_excel(io.BytesIO(obj['Body'].read()),sheet_name=data.sheet_names[i], engine='openpyxl')
+ df.to_sql("test"+str(i), engine1, index=False)
+
+Basically, code parses the S3 bucket and runs in a loop. For each sheet, it creates a table
+and dumps the data from sheet in that table.
+Where I'm having trouble is, when I run this code, I get this error.
+df = pd.read_excel(io.BytesIO(obj['Body'].read()),sheet_name=data.sheet_names[i-1], engine='openpyxl')
+zipfile.BadZipFile: File is not a zip file
+
+This is coming after I added 'openpyxl' engine in read_excel method. When I remove the engine, I get this error.
+raise ValueError(
+ValueError: Excel file format cannot be determined, you must specify an engine manually.
+
+Please note that I can print the connection to database, so there is no problem in connectivity, and I'm using latest version of python and pandas. Also, I can get all the sheet_names in the excel file so I'm able to reach to that file as well.
+Many Thanks!
","You are reading the obj twice, fully:
+
+data = pd.ExcelFile(io.BytesIO(obj['Body'].read()))
+pd.read_excel(io.BytesIO(obj['Body'].read()), ...)
+
+Your object can only be .read() once; a second read produces nothing, just an empty b"".
+To avoid re-reading the S3 stream many times, you could store it once in a BytesIO and rewind that BytesIO with seek.
+buf = io.BytesIO(obj["Body"].read())
+
+pd.ExcelFile(buf)
+
+buf.seek(0)
+
+pd.read_excel(buf, ...)
+
+# repeat
+
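The exhaustion behaviour can be reproduced with any file-like stream. This is only a sketch: io.BytesIO stands in for obj['Body'] here (an assumption made so the example runs without S3), and the byte string is a dummy payload.

```python
import io

# Stand-in for the S3 body: a stream that can be consumed once.
body = io.BytesIO(b'fake xlsx bytes')

first = body.read()    # consumes the whole stream
second = body.read()   # nothing left, so this is empty

# Keep the payload in a buffer of our own and rewind it between readers.
buf = io.BytesIO(first)
header = buf.read(4)   # some reader consumes part of the buffer
buf.seek(0)            # rewind before handing the buffer to the next reader
again = buf.read()

print(second)          # b''
print(again == first)  # True
```

In the loop from the question, this amounts to calling buf.seek(0) before each pd.read_excel(buf, ...).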
",python
+"Running task / function in the backgroundI wrote a program that captures the position of a license plate in my webcam feed using YOLOv4. The detection result is then passed to easyOCR for character recognition. Right now, I'm calling the OCR function inside the while loop every time a detection occurs. Is there a way to call the OCR function outside the loop without stopping the webcam feed? Some people suggested using a queue or a subprocess, but I'm not familiar with those concepts. Any help would be much appreciated.
+#detection
+while 1:
+ #_, pre_img = cap.read()
+ #pre_img= cv2.resize(pre_img, (640, 480))
+ _, img = cap.read()
+ #img = cv2.flip(pre_img,1)
+ hight, width, _ = img.shape
+ blob = cv2.dnn.blobFromImage(img, 1 / 255, (416, 416), (0, 0, 0), swapRB=True, crop=False)
+
+ net.setInput(blob)
+
+ output_layers_name = net.getUnconnectedOutLayersNames()
+
+ layerOutputs = net.forward(output_layers_name)
+
+ boxes = []
+ confidences = []
+ class_ids = []
+
+ for output in layerOutputs:
+ for detection in output:
+ score = detection[5:]
+ class_id = np.argmax(score)
+ confidence = score[class_id]
+ if confidence > 0.7:
+ center_x = int(detection[0] * width)
+ center_y = int(detection[1] * hight)
+ w = int(detection[2] * width)
+ h = int(detection[3] * hight)
+ x = int(center_x - w / 2)
+ y = int(center_y - h / 2)
+ boxes.append([x, y, w, h])
+ confidences.append((float(confidence)))
+ class_ids.append(class_id)
+
+ indexes = cv2.dnn.NMSBoxes(boxes, confidences, .5, .4)
+
+ boxes = []
+ confidences = []
+ class_ids = []
+
+ for output in layerOutputs:
+ for detection in output:
+ score = detection[5:]
+ class_id = np.argmax(score)
+ confidence = score[class_id]
+ if confidence > 0.5:
+ center_x = int(detection[0] * width)
+ center_y = int(detection[1] * hight)
+ w = int(detection[2] * width)
+ h = int(detection[3] * hight)
+
+ x = int(center_x - w / 2)
+ y = int(center_y - h / 2)
+
+ boxes.append([x, y, w, h])
+ confidences.append((float(confidence)))
+ class_ids.append(class_id)
+
+ indexes = cv2.dnn.NMSBoxes(boxes, confidences, .8, .4)
+ font = cv2.FONT_HERSHEY_PLAIN
+ colors = np.random.uniform(0, 255, size=(len(boxes), 3))
+ if len(indexes) > 0:
+ for i in indexes.flatten():
+ x, y, w, h = boxes[i]
+ label = str(classes[class_ids[i]])
+ confidence = str(round(confidences[i], 2))
+ color = colors[i]
+ cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
+ detected_image = img[y:y+h, x:x+w]
+ cv2.putText(img, label + " " + confidence, (x, y + 400), font, 2, color, 2)
+ #print(detected_image)
+ cv2.imshow('detection',detected_image)
+
+ result = OCR(detected_image)
+ print(result)
+
+Function for OCR
+def OCR(cropped_image):
+
+ result = reader.readtext(cropped_image)
+ text = ''
+ for result in result:
+ text += result[1] + ' '
+
+ spliced = (remove(text)).upper()
+ return spliced
+
","You could run the OCR function on another thread with the _thread module, like so:
+import time  # not necessary, only used to simulate work time
+import _thread as thread  # in Python 3 the module was renamed to _thread
+
+
+def OCR(cropped_image):
+
+    results = reader.readtext(cropped_image)
+    text = ''
+    for result in results:
+        text += result[1] + ' '
+
+    spliced = (remove(text)).upper()
+    print(spliced)  # print the result inside OCR, because a thread started this way can't easily return a value
+
+
+while 1:
+    time.sleep(5)  # simulating some work time
+    print("main")
+
+    detected_image = 1
+    thread.start_new_thread(OCR, (detected_image,))  # calling the OCR function on a new thread
+
+I hope it will help you...
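Since the question mentions queues, here is one possible sketch of that variant using the standard threading and queue modules. It is not the code from the answer: fake_ocr is a hypothetical stand-in for the OCR() function, the integers stand in for cropped images, and the for loop stands in for the webcam loop.

```python
import queue
import threading

def fake_ocr(image):
    # Hypothetical stand-in for OCR(cropped_image) from the question.
    return f'PLATE-{image}'

def ocr_worker(jobs, results, ocr):
    # Background loop: take a crop, run OCR on it, publish the text.
    while True:
        image = jobs.get()
        if image is None:   # sentinel value: stop the worker
            break
        results.put(ocr(image))

def run_demo(frames=3):
    jobs = queue.Queue(maxsize=1)   # at most one crop waiting for OCR
    results = queue.Queue()
    worker = threading.Thread(
        target=ocr_worker, args=(jobs, results, fake_ocr), daemon=True
    )
    worker.start()

    for frame_number in range(frames):   # stands in for the webcam loop
        try:
            jobs.put_nowait(frame_number)   # never blocks the video loop
        except queue.Full:
            pass                            # OCR still busy: skip this crop

    jobs.put(None)   # ask the worker to stop once it has caught up
    worker.join()

    collected = []
    while not results.empty():
        collected.append(results.get_nowait())
    return collected

print(run_demo())
```

How many crops get processed depends on timing, because crops are dropped while the worker is busy. In a real video loop you would poll results with get_nowait() on each iteration instead of waiting for the worker to finish.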
",python
+"Pydantic exclude field from __eq__ to avoid recursion errorI have a pydantic model like this:
+class SomeModel(pydantic.BaseModel):
+ name: str
+ content: str
+ previous_model: typing.Optional["SomeModel"] = None
+
+My code look like this, this is greatly simplified, In my real code there are many and circular dependencies occur by chance occasionally rather than being purposefully created:
+models = [
+ SomeModel("bob", "2"),
+ SomeModel("bob", "2"),
+]
+models[0].previous_model = models[1]
+models[1].previous_model = models[0]
+
+models.remove(models[0])
+
+This throws the following error:
+File "c:\Users\username\project-name\src\main.py", line 108, in run_all
+ models.remove(models[0])
+ File "pydantic\main.py", line 902, in pydantic.main.BaseModel.__eq__
+ File "pydantic\main.py", line 445, in pydantic.main.BaseModel.dict
+ File "pydantic\main.py", line 861, in _iter
+ File "pydantic\main.py", line 736, in pydantic.main.BaseModel._get_value
+ File "pydantic\main.py", line 445, in pydantic.main.BaseModel.dict
+ File "pydantic\main.py", line 861, in _iter
+ File "pydantic\main.py", line 736, in pydantic.main.BaseModel._get_value
+ File "pydantic\main.py", line 445, in pydantic.main.BaseModel.dict
+ File "pydantic\main.py", line 861, in _iter
+ File "pydantic\main.py", line 736, in pydantic.main.BaseModel._get_value
+ File "pydantic\main.py", line 445, in pydantic.main.BaseModel.dict
+ File "pydantic\main.py", line 861, in _iter
+ File "pydantic\main.py", line 736, in pydantic.main.BaseModel._get_value
+ File "pydantic\main.py", line 445, in pydantic.main.BaseModel.dict
+ File "pydantic\main.py", line 861, in _iter
+
+... Snip several hunderd more lines
+
+ File "pydantic\main.py", line 734, in pydantic.main.BaseModel._get_value
+ File "pydantic\main.py", line 304, in pydantic.main.ModelMetaclass.__instancecheck__
+RecursionError: maximum recursion depth exceeded
+
+I don't really need the previous_model field to be included in the equality at all. Is there a way to exclude it so that my stack doesn't overflow? This field is irrelevant for the purpose equality in this case.
","When pydantic generates __repr__, it iterates over its arguments. That makes sense, and it's one of the key selling points, but it doesn't work well in your scenario: you'd have to omit previous_model from __repr__ to make it work. You can either skip previous_model in __repr_args__ or return something simpler in __repr__. Here is a very simplified working example:
+
+import typing
+
+import pydantic
+
+
+class SomeModel(pydantic.BaseModel):
+    name: str
+    content: str
+    previous_model: typing.Optional["SomeModel"] = None
+
+    def __repr__(self):
+        return self.name
+
+
+SomeModel.update_forward_refs()
+
+
+models = [
+ SomeModel(name="bob", content="2"),
+ SomeModel(name="bob", content="2"),
+]
+models[0].previous_model = models[1]
+models[1].previous_model = models[0]
+
+models.remove(models[0])
+
+print(models)
+
+A less simple version that's closer to how pydantic behaves, and that will also work in your case:
+
+import typing
+
+import pydantic
+
+
+class SomeModel(pydantic.BaseModel):
+    name: str
+    content: str
+    previous_model: typing.Optional["SomeModel"] = None
+
+    def __repr_args__(self, *args, **kwargs):
+        args = self.dict(exclude={'previous_model',})
+        return list(args.items())
+
+
+SomeModel.update_forward_refs()
+
+
+models = [
+ SomeModel(name="bob", content="2"),
+ SomeModel(name="bob", content="2"),
+]
+models[0].previous_model = models[1]
+models[1].previous_model = models[0]
+
+models.remove(models[0])
+
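The traceback in the question goes through __eq__, so the same principle (leave the back-reference out of the comparison) applies there as well. The sketch below deliberately swaps pydantic for a standard-library dataclass so that it runs without extra packages; Node is a hypothetical class, and whether an equivalent __eq__ override behaves the same on a pydantic model should be checked against your pydantic version.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    name: str
    content: str
    # compare=False keeps this field out of the generated __eq__,
    # repr=False keeps it out of the generated __repr__.
    previous: Optional['Node'] = field(default=None, compare=False, repr=False)

a = Node('bob', '2')
b = Node('bob', '2')
a.previous = b   # circular reference, as in the question
b.previous = a

print(a == b)    # True: only name and content are compared
nodes = [a, b]
nodes.remove(b)  # relies on ==, no recursion through previous
print(nodes)
```

Without compare=False, the generated __eq__ would compare previous as well and follow the cycle until the recursion limit is hit.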
",python
+"Convert a list of ""dictionary of dictionaries"" to a dataframeI have a list of "dictionary of dictionaries" that looks like this:
+lis = [{'Health and Welfare Plan + Change Notification': {'evidence_capture': 'null',
+ 'test_result_justification': 'null',
+ 'latest_test_result_date': 'null',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'Not Started',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Computations': {'evidence_capture': 'null',
+ 'test_result_justification': 'null',
+ 'latest_test_result_date': 'null',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'Not Started',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Data Agreements': {'evidence_capture': 'null',
+ 'test_result_justification': 'Due to the Policy',
+ 'latest_test_result_date': '2019-10-02',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'In Progress',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Data Elements': {'evidence_capture': 'null',
+ 'test_result_justification': 'xxx',
+ 'latest_test_result_date': '2019-10-02',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'In Progress',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Data Quality Monitoring': {'evidence_capture': 'null',
+ 'test_result_justification': 'xxx',
+ 'latest_test_result_date': '2019-08-09',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'Completed',
+ 'test_result': 'xxx'}},
+ {'Health and Welfare Plan + HPU Source Reliability': {'evidence_capture': 'null',
+ 'test_result_justification': 'xxx.',
+ 'latest_test_result_date': '2019-10-02',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'In Progress',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Lineage': {'evidence_capture': 'null',
+ 'test_result_justification': 'null',
+ 'latest_test_result_date': 'null',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'Not Started',
+ 'test_result': 'null'}},
+ {'Health and Welfare Plan + Metadata': {'evidence_capture': 'null',
+ 'test_result_justification': 'Valid',
+ 'latest_test_result_date': '2020-07-02',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'Completed',
+ 'test_result': 'xxx'}},
+ {'Health and Welfare Plan + Usage Reconciliation': {'evidence_capture': 'null',
+ 'test_result_justification': 'Test out of scope',
+ 'latest_test_result_date': '2019-10-02',
+ 'last_updated_by': 'null',
+ 'test_execution_status': 'In Progress',
+ 'test_result': 'null'}}]
+
+I would like to convert the list into a dataframe that looks like this:
+ evidence_capture last_updated_by latest_test_result_date test_execution_status test_result test_result_justification test_category
+Change Notification null null null Not Started null null Health and Welfare Plan
+Computations null null null Not Started null null Health and Welfare Plan
+Data Agreements null null 2019-10-02 In Progress null Due to the Policy Health and Welfare Plan
+Data Elements null null 2019-10-02 In Progress null xxx Health and Welfare Plan
+Data Quality Monitoring null null 2019-08-09 Completed xxx xxx Health and Welfare Plan
+HPU Source Reliability null null 2019-10-02 In Progress null xxx. Health and Welfare Plan
+Lineage null null null Not Started null null Health and Welfare Plan
+Metadata null null 2020-07-02 Completed xxx Valid Health and Welfare Plan
+Usage Reconciliation null null 2019-10-02 In Progress null Test out of scope Health and Welfare Plan
+
+My code to build the dataframe is using a for-loop to concat the records column by column. After that to process the column names, and then transpose it. The final output would have the repeated string "Health and Welfare Plan" removed from each row index, but appended as a new column.
+df3 = pd.DataFrame(lis[0])
+for i in range(1, len(lis)):
+ df3 = pd.concat([df3, pd.DataFrame(lis[i])], axis=1)
+df3.columns = [col.split(' + ')[1] for col in df3.columns]
+df3 = df3.T
+df3['test_category'] = 'Health and Welfare Plan'
+print(df3)
+
+The code is able to produce the final output, but using "expensive" functions of both for-loop and dataframe concat. So I was wondering if there is a better way to output the same results?
","Use a dict comprehension to flatten the list of dictionaries:
+pd.DataFrame({k.split(' + ')[1]: v for d in lis for k, v in d.items()}).T
+
+
+ evidence_capture test_result_justification latest_test_result_date last_updated_by test_execution_status test_result
+Change Notification null null null null Not Started null
+Computations null null null null Not Started null
+Data Agreements null Due to the Policy 2019-10-02 null In Progress null
+Data Elements null xxx 2019-10-02 null In Progress null
+Data Quality Monitoring null xxx 2019-08-09 null Completed xxx
+HPU Source Reliability null xxx. 2019-10-02 null In Progress null
+Lineage null null null null Not Started null
+Metadata null Valid 2020-07-02 null Completed xxx
+Usage Reconciliation null Test out of scope 2019-10-02 null In Progress null
+
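The output above does not include the test_category column from the question. One way to keep it, sketched here in plain Python with a shortened two-record sample (the sample data is an abbreviation made for this illustration), is to split each key once and store the prefix next to the record:

```python
lis = [
    {'Health and Welfare Plan + Change Notification': {
        'test_execution_status': 'Not Started', 'test_result': 'null'}},
    {'Health and Welfare Plan + Metadata': {
        'test_execution_status': 'Completed', 'test_result': 'xxx'}},
]

rows = {}
for d in lis:
    for key, record in d.items():
        # 'Category + Name' -> ('Category', 'Name')
        category, name = key.split(' + ', 1)
        rows[name] = {**record, 'test_category': category}

print(rows['Metadata'])
```

Passing rows to pd.DataFrame.from_dict(rows, orient='index') should then produce one row per test with test_category as the last column, without any transpose.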
",python
+"How can I replace each occurrence of a substring independently of the others?I have a sentence like this:
+sentence = "it is a stupid book and a stupid hat and a stupid computer"
+
+and I want to replace the word "stupid" with the word "silly". And create a sentence for each replacement.
+The result is something like this:
+
+"it is a silly book and a stupid hat and a stupid computer"
+
+"it is a stupid book and a silly hat and a stupid computer"
+
+"it is a stupid book and a stupid hat and a silly computer"
+
+
+I mean the creating just 3 sentences each one for each replacement.
+I found the method replace() in Python but the count parameter determines the number of replacements not the one that I want. Could anyone help me?
","One solution:
+string = "it is a stupid book and a stupid hat and a stupid computer"
+
+for i in range(string.count('stupid')):
+ new = string.replace('stupid', 'silly', i+1)
+ new = new.replace('silly', 'stupid', i)
+ print(new)
+
+EDIT
+Another solution with a list comprehension:
+string = "it is a stupid book and a stupid hat and a stupid computer"
+
+word1 = 'stupid'
+word2 = 'silly'
+
+strings = [string.replace(word1, word2, i+1).replace(word2, word1, i) for i in range(string.count(word1))]
+
+print(strings)
+
+EDIT2
+If word2 happens to be in string (as W. Ding commented), then you can try this (choosing an unlikely placeholder string, as long as you want):
+string = "silly...it is a stupid book and a stupid hat and a stupid computer"
+word1 = 'stupid'
+word2 = 'silly'
+unlikely = '$*$ù^$*ù*;;;,:;!::*ù*$^^$$^d^^d^eêê'
+
+
+strings = [string.replace(word1, unlikely, i+1).
+ replace(unlikely, word1, i).
+ replace(unlikely, word2) for i in range(string.count(word1))]
+
+print(strings)
+
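A variant that needs no placeholder string is to locate each occurrence and rebuild the sentence around it. This is a sketch of a different technique from the replace() chains above; replace_each is a name chosen for this illustration.

```python
import re

def replace_each(text, old, new):
    # One output string per occurrence of old, with only that occurrence replaced.
    return [
        text[:m.start()] + new + text[m.end():]
        for m in re.finditer(re.escape(old), text)
    ]

sentences = replace_each(
    'silly...it is a stupid book and a stupid hat', 'stupid', 'silly'
)
for s in sentences:
    print(s)
```

Because the replacement is spliced in by position, it does not matter whether the new word already occurs elsewhere in the sentence.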
",python
+"function that checks valid words with certain letters, loop functionsHello I am trying to create a function that will retrieve a word of any length and combination of the six letters such as odd, Pow. I want the function to see if the word is valid. If it is not I then want the person to type a new word in and then check that word to see if it is valid. If it is not I want to continue the loop until the word contains only the six letters. Thank you.
+accepeted_letters = "U", "P", "D", "O", "W", "N"
+user_input_uppercase = str(input("Select one or more of each
+letter to be displated [U, P, D, O, W, N], Thank you ")).upper()
+user_input_string = str(user_input_uppercase).replace(" ", "")
+
+def valid():
+ global accepeted_letters, user_input_string
+ while user_input_string[0] not in accepeted_letters:
+ user_input_uppercase = input("Please Enter one or more
+of the following letters: [U, P, D, O, W, N]").upper()
+ if user_input_uppercase in accepeted_letters:
+ pass
+
+valid()
+
","You need to prompt for input in a loop, and either break the loop or return the value from your function when the condition is met. An easy way to test your valid letter condition is to turn the user's input into a set and then check to see if it's a subset of the valid letters.
+valid_letters = "UPDOWN"
+
+def get_valid_letters():
+    while True:
+        user_letters = input(
+            "Select one or more of each letter: ["
+            + ", ".join(valid_letters)
+            + "], Thank you "
+        ).upper().replace(" ", "")
+        if set(user_letters).issubset(valid_letters):
+            return user_letters
+
+print(get_valid_letters(), "is valid!")
+
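If you want to test the check without typing input, the validation can be pulled into its own function. This is a sketch; is_valid_word is a hypothetical helper, and rejecting empty input is an assumption about the desired behaviour (the subset test alone accepts an empty string, because the empty set is a subset of every set).

```python
VALID_LETTERS = set('UPDOWN')

def is_valid_word(word, allowed=VALID_LETTERS):
    # Normalise the same way as the prompt loop: upper-case, no spaces.
    letters = word.upper().replace(' ', '')
    # Require at least one letter, and only letters from the allowed set.
    return bool(letters) and set(letters) <= allowed

for candidate in ['Pow', 'odd', 'cat', '']:
    print(repr(candidate), is_valid_word(candidate))
```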
",python
+"python Twitch-chatbot MONKALOT encounters json error on startupPresently I'm trying to make MONKALOT run on a PythonAnywhere account (customized Web Developer). I have basic knowledge of Linux, unfortunately no experience developing Python scripts, but advanced experience developing in Java (hope that helps).
+My success log so far:
+After upgrading my account to Web Developer level I finally made pip download the requirements (https://github.com/NMisko/monkalot/blob/master/requirements.txt) and half the internet (2 of 5GB used). All modules and dependencies seem to be successfully installed.
+I configured my own monkalot-channel including OAuth which serves as a staging instance for now. The next challenge was how to get monkalot starting up. Using python3.7 instead of python or any other python3 environment did the trick.
+But now I'm stuck. After "completing the training stage" the monkalot-script prematurely ends with the following message:
+[22:14] ...chat bot finished training.
+Traceback (most recent call last):
+ File "monkalot.py", line 72, in <module>
+ bots.append(TwitchBot(path))
+ File "/home/Chessalot/monkalot/bot/bot.py", line 56, in __init__
+ self.users = self.twitch.get_chatters()
+ File "/home/Chessalot/monkalot/bot/data_sources/twitch.py", line 25, in get_chatters
+ data = requests.get(USERLIST_API.format(self.channel)).json()
+ File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 900, in json
+ return complexjson.loads(self.text, **kwargs)
+ File "/usr/local/lib/python3.7/site-packages/simplejson/__init__.py", line 525, in loads
+ return _default_decoder.decode(s)
+ File "/usr/local/lib/python3.7/site-packages/simplejson/decoder.py", line 370, in decode
+ obj, end = self.raw_decode(s)
+ File "/usr/local/lib/python3.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
+ return self.scan_once(s, idx=_w(s, idx).end())
+simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
+
+By now I figured out that monkalot tries to load the chatters list and expects at least an empty json array as result but actually seems to receive an empty string.
+So my question is: What can I do to make the monkalot-script work? Is monkalot's current version incompatible with the current Twitch-API? Are there any outdated python libraries which may cause the incompatibility? Or is there an unrecognized configuration issue preventing the script from running successfully?
+Thank you all in advance. Any ideas provided by you are highly appreciated.
","After I inspected the response, I found out that I had received an HTTP 400 (Bad Request) error WITHOUT any data in the HTTP response body. Since monkalot expects a JSON answer, the errors were raised. This was because I used an uppercase letter in the channel configuration, whereas Twitch expects all letters to be lowercase.
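A failure like this is easier to diagnose when the status code and body are checked before the JSON is decoded. The sketch below uses only the standard library; decode_json_body and normalize_channel are hypothetical helpers and not part of monkalot, and the arguments correspond to the status_code and text attributes of a requests response.

```python
import json

def normalize_channel(name):
    # Twitch channel names are expected in lowercase (per the finding above).
    return name.strip().lower()

def decode_json_body(status_code, text):
    # Raise a readable error instead of a bare JSONDecodeError.
    if not 200 <= status_code < 300:
        raise RuntimeError(f'HTTP {status_code}, body: {text[:200]!r}')
    if not text.strip():
        raise RuntimeError('empty response body')
    return json.loads(text)

print(normalize_channel(' Chessalot '))
print(decode_json_body(200, '{"chatter_count": 0}'))
```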
",python
+"length of list read from a csv varies, depending on when a read is performedI'm trying to write a simple function that reads in a csv and stores its content inside a list and either returns said list formatted or a string containing the information on why the csv is empty.
+For clarification, the csv is provided by an API called via curl. If there is new data available, the csv-structure is as follows:
+{
+entry1;entry2;entry3
+entry1;entry2;entry3
+}
+
+however, if there is an error or no data is available, the structure is as follows:
+{error code / info / warning / whatever}
+
+So to gather the data, I first check whether the csv contains more than 2 rows, simply by reading the file and checking the length of the list returned by f.readlines(). If that is the case, I filter the data through a list comprehension to ignore the "{}" and return the list as a result. If the list length is below 3, there is definitely no data available and there is solely the error code in line 0, so I return f.readlines()[0].
+Here's the code of the function that does this:
+def belege_to_list() -> Union[list, str, None]:
+ try:
+ with open(export_path + "Belege.csv", mode='r', encoding="utf-8") as f:
+ if 0 < len(f.readlines()) < 3:
+ return str(f.readlines()[0])
+ if len(f.readlines()) > 2:
+ belege_list = [i.replace("\n", "").split(";") for i in f.readlines()[1:-1]]
+ return belege_list
+ except FileNotFoundError as e:
+ print("Belegdatei nicht gefunden.")
+ with open(main_path + "log.txt", "a+", encoding="utf-8") as log:
+ log.write("[" + str(datetime.now()) + "]: " + "Belege nicht gefunden / Datei fehlt - Error: " + str(e) + "\n")
+ return None
+
+the csv i'm reading is most definitely containing one line - also at the runtime of the code:
+{"error":"sessionid expired or loginname,passwort failed"}
+
+the function is executed as follows:
+def main():
+ export_belege() # creates csv
+ belege = belege_to_list()
+ ...
+
+However, I'm always getting an IndexError: list index out of range at return str(f.readlines()[0]). Now here's the part I sincerely do not understand. If I print out len(f.readlines()) BEFORE if 0 < len(f.readlines()) < 3:, the result is 1 and the code works. If it's not printed out, however, len(f.readlines()) is supposedly 0, giving me an IndexError: list index out of range. Does anyone have an idea why that might be the case?
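","You should save the list returned by readlines() and then refer to that list for subsequent operations. Each call to readlines() reads from the current file position to the end of the file, so a second call on the same handle returns an empty list. Something like this: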
+from datetime import datetime
+export_path = 'foo'
+main_path = 'bar'
+def belege_to_list():
+    try:
+        with open(export_path + "Belege.csv", encoding="utf-8") as f:
+            lines = list(map(str.strip, f))
+        if len(lines) < 3:
+            return lines[0] if len(lines) > 0 else None
+        if lines[0][0] == '{' and lines[-1][0] == '}':
+            return [line.split(';') for line in lines[1:-1]]
+    except FileNotFoundError as e:
+        print("Belegdatei nicht gefunden.")
+        with open(main_path + "log.txt", "a+", encoding="utf-8") as log:
+            log.write("[" + str(datetime.now()) + "]: " + "Belege nicht gefunden / Datei fehlt - Error: " + str(e) + "\n")
+
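The behaviour behind the IndexError can be seen in isolation. In this sketch io.StringIO stands in for the opened Belege.csv (an assumption made so the example needs no file on disk), and the content is a simplified one-line error message.

```python
import io

# Stand-in for the opened csv file containing a single error line.
f = io.StringIO('{error: sessionid expired}\n')

first = f.readlines()    # reads everything up to the end of the file
second = f.readlines()   # the position is already at the end: nothing left

f.seek(0)                # rewinding makes the content readable again
third = f.readlines()

print(len(first), len(second), len(third))  # 1 0 1
```

In the original function, the len(f.readlines()) inside the if consumed the file, so the f.readlines()[0] on the next line indexed into an empty list.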
",python
+"editing the first word in match, regexI have a regex problem. In found matches (these are lines from a text file) I want to edit them so I can get shortened version of these lines. For example:
+matches=[Dog Alex, Dog Chriss, Cat Susan, Lizard Bob, and so on]
+From this I want to get:
+new_version=[D.Alex, D.Chriss, C.Susan, and so on]
+So I wrote this code:
+for match in matches:
+ mpattern=re.compile(r'^\w (.+)')
+ match=re.sub(mpattern, r'[A-Z]\. (.+)', match)
+ list_of_descriptions.append(match)
+
+but it doesn't work correctly :(
+I need the program to find the first word (which could begin with any letter) and then shorten it to the first letter and add a dot. Could someone help me, please? I'm using Python 3.7.9.
","The regex should capture the first letter, but should allow for more letters in the first word. On the other hand, as you're not planning to change anything in the second word, there is no need to include it in the match.
+Side note: when you compile your regex, you can call the sub method on that object, instead of on re.
+So:
+mpattern = re.compile(r'^(\w)\w* ')
+list_of_descriptions = [mpattern.sub(r'\1.', match) for match in matches]
+
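Put together as a self-contained script, with the sample list from the question written out as Python strings (the list literal is an assumption about what matches contains):

```python
import re

matches = ['Dog Alex', 'Dog Chriss', 'Cat Susan', 'Lizard Bob']

# Capture the first letter, consume the rest of the first word and the space.
mpattern = re.compile(r'^(\w)\w* ')
new_version = [mpattern.sub(r'\1.', match) for match in matches]

print(new_version)  # ['D.Alex', 'D.Chriss', 'C.Susan', 'L.Bob']
```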
",python
+"Set subtraction while casting lists and stringI have two lists of strings. When I cast two lists into set() then subtraction works properly.
+>>> A = ['One', 'Two', 'Three', 'Four']
+>>> B = ['One']
+>>> print(list(set(A) - set(B)))
+['Three', 'Two', 'Four']
+
+However when variable B is a string and I cast it into set() then, the subtraction is not working.
+>>> A = ['One', 'Two', 'Three', 'Four']
+>>> B = 'One'
+>>> print(list(set(A) - set(B)))
+['One', 'Two', 'Three', 'Four']
+
+Is anyone able to explain me if its a bug or expected behavior?
","The set() function, when given a string, will generate a set of all its characters:
+print(set("ABC")) # {'A', 'B', 'C'} (element order may vary)
+
+If you want a set with a single string ABC in it, then you need to pass a collection containing that string to set():
+print(set(["ABC"])) # {'ABC'}
+
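Applied to the lists from the question, a set literal is a short way to wrap the string as a single element. A minimal sketch:

```python
A = ['One', 'Two', 'Three', 'Four']
B = 'One'

chars = set(B)             # the individual characters of the string
unchanged = set(A) - chars # none of the characters is an element of A
remaining = set(A) - {B}   # {B} is a set containing the whole string

print(sorted(chars))       # ['O', 'e', 'n']
print(sorted(remaining))   # ['Four', 'Three', 'Two']
```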
",python
+"python re.sub problems backreferencing from a functionI want to 'join' certain numbers, that clearly should be together, although I don't want them to join every number.
+What I have:
+'Canesten 1 500 mg meka kapsula za rodnicui'
+'Clexane 10 000 IU (100 mg)/1 ml otopina za injekciju'
+'Humulin M3 100 IU/ml suspenzija za injekciju u ulošku'
+'Docile10 000 IU/ml oralne kapi, otopina'
+'POLYGYNAX 35 000 IU / 35 000 IU / 100 000 IU kapsula za rodnicu, meka'
+'Prostin E2 2 mg gel za rodnicu'
+'Silapen K 1 000 000 IU filmom obložene tablete'
+
+I want to have:
+'Canesten 1500 mg meka kapsula za rodnicui'
+'Clexane 10000 IU (100 mg)/1 ml otopina za injekciju'
+'Humulin M3 100 IU/ml suspenzija za injekciju u ulošku'
+'Docile10000 IU/ml oralne kapi, otopina'
+'POLYGYNAX 35000 IU / 35000 IU / 100000 IU kapsula za rodnicu, meka'
+'Prostin E2 2 mg gel za rodnicu'
+'Silapen K 1000000 IU filmom obložene tablete'
+
+It may be easier to see which ones I'm trying to join here: https://regex101.com/r/Ht9ZVi/1
+I can match each one of the numbers I want to join using ([^a-zA-Z](?:\d+\s+)*\d+\s\d+0{2}), but because this regex is not perfect regarding the blank spaces I thought about using a function to only remove the blank spaces between numbers.
+def spaces(s):
+ return re.sub('(?<=\d) (?=\d)', '', s)
+
+cr['Name'].apply(lambda x: re.sub(r"([^a-zA-Z](?:\d+\s*)*\d+\s\d+0{2})", spaces(r'\1'), x))
+
+This returns the strings unaltered, what am I doing wrong?
+I know this is a common question, and the solution is probably really simple but I can't wrap my head around it..
","In your pattern you want to match a leading single char other than a-zA-Z with [^a-zA-Z], but you can instead assert that there is no uppercase A-Z directly to the left, which accounts for Docile10 000.
+Then you don't need a capture group: you can match the digits with at least 1 space in between, followed by an assertion for one of the allowed units.
+Then remove the spaces from the match using .group(0).
+The part [^\S\n]+ matches whitespace chars without newlines. If you want to allow crossing newlines, you can use \s+ instead.
+(?<![A-Z])\d+(?:[^\S\n]+\d+)+(?=[^\S\n]*(?:mg|IU)\b)
+
+Regex demo
+You can also omit the assertion for the unit at the end for the current example data:
+(?<![A-Z])\d+(?:[^\S\n]+\d+)+
+
+Example
+import re
+
+strings = [
+ 'Canesten 1 500 mg meka kapsula za rodnicui',
+ 'Clexane 10 000 IU (100 mg)/1 ml otopina za injekciju',
+ 'Humulin M3 100 IU/ml suspenzija za injekciju u ulošku',
+ 'Docile10 000 IU/ml oralne kapi, otopina',
+ 'POLYGYNAX 35 000 IU / 35 000 IU / 100 000 IU kapsula za rodnicu, meka',
+ 'Prostin E2 2 mg gel za rodnicu',
+ 'Silapen K 1 000 000 IU filmom obložene tablete'
+]
+
+pattern = r"(?<![A-Z])\d+(?:[^\S\n]+\d+)+(?=[^\S\n]*(?:mg|IU)\b)"
+
+for s in strings:
+ print(re.sub(pattern, lambda x: re.sub(r"\s+", "", x.group()), s))
+
+Output
+Canesten 1500 mg meka kapsula za rodnicui
+Clexane 10000 IU (100 mg)/1 ml otopina za injekciju
+Humulin M3 100 IU/ml suspenzija za injekciju u ulošku
+Docile10000 IU/ml oralne kapi, otopina
+POLYGYNAX 35000 IU / 35000 IU / 100000 IU kapsula za rodnicu, meka
+Prostin E2 2 mg gel za rodnicu
+Silapen K 1000000 IU filmom obložene tablete
+
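As for why the attempt in the question returned the strings unaltered: spaces(r'\1') is evaluated once, before re.sub runs, on the literal two-character string made of a backslash and a 1. That string contains no digits separated by a space, so it comes back unchanged and re.sub simply puts group 1 back. Passing a function as the replacement defers the call until a match exists. A small sketch, reusing the spaces helper from the question and the shorter pattern from above:

```python
import re

def spaces(s):
    return re.sub(r'(?<=\d) (?=\d)', '', s)

# Evaluated immediately: the input is just a backslash followed by 1.
eager = spaces(r'\1')

# Evaluated per match: the function receives the match object.
pattern = r'(?<![A-Z])\d+(?:[^\S\n]+\d+)+'
name = 'Silapen K 1 000 000 IU filmom'
joined = re.sub(pattern, lambda m: spaces(m.group(0)), name)

print(eager == r'\1')  # True
print(joined)          # Silapen K 1000000 IU filmom
```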
",python
+"Concatenate 2 dataframes and repeat values from small one with pandasI have these two dataframes:
+
+
+
+
+| Field1 |
+Field2 |
+
+
+
+
+| 0.5 |
+0.7 |
+
+
+| 2 |
+1 |
+
+
+| 3 |
+0.1 |
+
+
+| 4 |
+0.4 |
+
+
+
+
+and
+
+
+
+
+| Date |
+Time |
+
+
+
+
+| 2022-08-01 |
+1 |
+
+
+| 2022-08-01 |
+2 |
+
+
+
+
+and a I need to obtain the following:
+
+
+
+
+| Field1 |
+Field2 |
+Date |
+Time |
+
+
+
+
+| 0.5 |
+0.7 |
+2022-08-01 |
+1 |
+
+
+| 2 |
+1 |
+2022-08-01 |
+2 |
+
+
+| 3 |
+0.1 |
+2022-08-01 |
+1 |
+
+
+| 4 |
+0.4 |
+2022-08-01 |
+2 |
+
+
+
+
+Thanks in advance
","You can repeat your second dataframe until it matches the length of the first, and then concatenate it with the first dataframe.
+import pandas as pd
+
+df1 = pd.DataFrame({'Field1': [0.5, 2, 3, 4], 'Field2': [0.7, 1, 0.1, 0.4]})
+print(df1)
+# Field1 Field2
+# 0 0.5 0.7
+# 1 2.0 1.0
+# 2 3.0 0.1
+# 3 4.0 0.4
+
+df2 = pd.DataFrame({'Date': ['2022-08-01', '2022-08-01'], 'Time': [1, 2]})
+print(df2)
+# Date Time
+# 0 2022-08-01 1
+# 1 2022-08-01 2
+
+n = len(df1) // len(df2)  # row counts, not .size (rows * columns); assumes len(df1) is a multiple of len(df2)
+df3 = pd.concat([df2] * n, axis=0).reset_index(drop=True)
+print(df3)
+# Date Time
+# 0 2022-08-01 1
+# 1 2022-08-01 2
+# 2 2022-08-01 1
+# 3 2022-08-01 2
+
+df4 = pd.concat([df1, df3], axis=1)
+print(df4)
+# Field1 Field2 Date Time
+# 0 0.5 0.7 2022-08-01 1
+# 1 2.0 1.0 2022-08-01 2
+# 2 3.0 0.1 2022-08-01 1
+# 3 4.0 0.4 2022-08-01 2
+
+or shorter:
+df4 = pd.concat([
+ df1,
+ pd.concat(
+ [df2] * (len(df1) // len(df2)),
+ axis=0
+ ).reset_index(drop=True)
+], axis=1)
+
",python
+"problem with inserting data into the bar chart plotly graph_objectsso I'm trying to add the lists (Internal, External, ...) to the bar chart, but all the data goes into one column, "Internal"
+![]()
+it is supposed to be similar to this
+![]()
+here is the whole df and the work I did
+![]()
","In your case, the problem is:
+
+- You want to plot a bar chart with
plotly.graph_objects
+- What exactly is data that
go.Bar() needed. (status, escalation)
+
+Deduce from questions above,
+a) How do you fetch the minimum portion of data for the demo in Stack Overflow?
+b) How to prepare the data from your df and transform them available to plot.
+
+
+
+Preparing for the data we want to plot:
+import random
+import pandas as pd
+import plotly.graph_objects as go
+random.seed(42)
+sample_quantity = 50
+status = [random.choice(['Not Done','Something Else','Done']) for i in range(sample_quantity)]
+escalation = [random.choice(['Internal','External','Unspecified']) for i in range(sample_quantity)]
+df = pd.DataFrame({
+ 'status':status,
+ 'escalation':escalation
+})
+df
+###
+ status escalation
+0 Done Unspecified
+1 Not Done External
+2 Not Done Internal
+3 Done Unspecified
+4 Something Else External
+5 Not Done Unspecified
+⋮ ⋮ ⋮
+43 Something Else Unspecified
+44 Not Done Unspecified
+45 Not Done Internal
+46 Something Else Internal
+47 Not Done External
+48 Something Else External
+49 Something Else External
+
+
+
+
+Plot:
+# plot bar chart grouping with status and escalation and color by status
+# sort the groups and reindex each count so that x and y always line up
+group_list = sorted(df['escalation'].unique())
+palette = {"Not Done": "#d89a9e","Something Else":"#e0c1b3", "Done": "#aeb4a9"}
+
+fig = go.Figure(data=[
+ go.Bar(name='Not Done',
+ x=group_list,
+ y=df[df['status'] == 'Not Done'].groupby('escalation').size().reindex(group_list, fill_value=0),
+ marker_color=palette['Not Done']),
+ go.Bar(name='Something Else',
+ x=group_list,
+ y=df[df['status'] == 'Something Else'].groupby('escalation').size().reindex(group_list, fill_value=0),
+ marker_color=palette['Something Else']),
+ go.Bar(name='Done',
+ x=group_list,
+ y=df[df['status'] == 'Done'].groupby('escalation').size().reindex(group_list, fill_value=0),
+ marker_color=palette['Done'])])
+
+fig.update_layout(title='Group by Escalation, Color by Status', barmode='stack')
+fig.show()
+
+
+![]()
",python
+"Increasing number of permutations of all possible combinations of a list with repetitions allowedI am trying without success to understand how to use itertools to generate a list with all possible combinations of the elements of a list, with an increasing size of elements to pick and including repetitions.
+I would like to add also a separator:
+lis = ['a','b','c']
+separator = '/'
+total_number_of_combinations = 3
+permutation_list = ['a','b','c', 'a/a', 'a/b', 'a/c', 'b/a', 'b/b', 'b/c', 'c/a', 'c/b', 'c/c',
+ 'a/a/a', 'a/a/b', 'a/a/c', 'a/b/a', 'a/b/b', 'a/b/c', 'a/c/a', 'a/c/b', 'a/c/c',
+ 'b/a/a', 'b/a/b', 'b/a/c', 'b/b/a', 'b/b/b', 'b/b/c', 'b/c/a', 'b/c/b', 'b/c/c',
+ 'c/a/a', 'c/a/b', 'c/a/c', 'c/b/a', 'c/b/b', 'c/b/c', 'c/c/a', 'c/c/b', 'c/c/c']
+
+The list will then have len(lis)+len(lis)**2+len(lis)**3+...+len(lis)**n elements, with n=total_number_of_combinations.
+I need to keep the separator and total_number_of_combinations configurable.
+I need this in a list that can be used as a condition for filtering a pandas DataFrame (I will check dt[dt.my_col.isin(permutation_list)]).
+I would appreciate any help, a pointer to a duplicate topic, or even an explanation of how to state this problem correctly, because I did not find any topic that answers this question (maybe I am using the wrong keywords...). Maybe there is also a function from another module that does this, but I don't know of one.
+UPDATE:
+Following the request of @Scott, here is my real case:
+lis = ['BRUTELE','COCKPIT EST', 'CIRCET']
+separator = ' / '
+total_number_of_combinations = 10
+
+so my final list needs to have 88572 elements.
","What you are trying to get is a product, not a combination.
+import itertools
+
+lis = ["BRUTELE", "COCKPIT EST", "CIRCET"]
+separator = " / "
+total_number_of_combinations = 3
+
+result = list(
+ itertools.chain.from_iterable(
+ (separator.join(a) for a in itertools.product(lis, repeat=i))
+ for i in range(1, total_number_of_combinations + 1)
+ )
+)
+
+assert result == ['BRUTELE', 'COCKPIT EST', 'CIRCET', 'BRUTELE / BRUTELE', 'BRUTELE / COCKPIT EST', 'BRUTELE / CIRCET', 'COCKPIT EST / BRUTELE', 'COCKPIT EST / COCKPIT EST', 'COCKPIT EST / CIRCET', 'CIRCET / BRUTELE', 'CIRCET / COCKPIT EST', 'CIRCET / CIRCET', 'BRUTELE / BRUTELE / BRUTELE', 'BRUTELE / BRUTELE / COCKPIT EST', 'BRUTELE / BRUTELE / CIRCET', 'BRUTELE / COCKPIT EST / BRUTELE', 'BRUTELE / COCKPIT EST / COCKPIT EST', 'BRUTELE / COCKPIT EST / CIRCET', 'BRUTELE / CIRCET / BRUTELE', 'BRUTELE / CIRCET / COCKPIT EST', 'BRUTELE / CIRCET / CIRCET', 'COCKPIT EST / BRUTELE / BRUTELE', 'COCKPIT EST / BRUTELE / COCKPIT EST', 'COCKPIT EST / BRUTELE / CIRCET', 'COCKPIT EST / COCKPIT EST / BRUTELE', 'COCKPIT EST / COCKPIT EST / COCKPIT EST', 'COCKPIT EST / COCKPIT EST / CIRCET', 'COCKPIT EST / CIRCET / BRUTELE', 'COCKPIT EST / CIRCET / COCKPIT EST', 'COCKPIT EST / CIRCET / CIRCET', 'CIRCET / BRUTELE / BRUTELE', 'CIRCET / BRUTELE / COCKPIT EST', 'CIRCET / BRUTELE / CIRCET', 'CIRCET / COCKPIT EST / BRUTELE', 'CIRCET / COCKPIT EST / COCKPIT EST', 'CIRCET / COCKPIT EST / CIRCET', 'CIRCET / CIRCET / BRUTELE', 'CIRCET / CIRCET / COCKPIT EST', 'CIRCET / CIRCET / CIRCET']
+
+However, the number of resulting items is O(len(lis) ** total_number_of_combinations), which may be computationally expensive. A better method would be parsing a string by splitting at separators and testing the membership of each split string.
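The membership test suggested above could look something like the sketch below. The function name is_allowed_path and its parameters are illustrative and not from the question, and the sketch assumes that no element of lis contains the separator itself.

```python
def is_allowed_path(value, allowed, separator, max_parts):
    """Return True if value is 1..max_parts allowed items joined by separator."""
    parts = value.split(separator)
    return len(parts) <= max_parts and all(part in allowed for part in parts)


lis = ["BRUTELE", "COCKPIT EST", "CIRCET"]
allowed = set(lis)  # set lookup keeps each membership test cheap

ok = is_allowed_path("CIRCET / BRUTELE / CIRCET", allowed, " / ", 10)
too_long = is_allowed_path(" / ".join(["CIRCET"] * 11), allowed, " / ", 10)
unknown = is_allowed_path("CIRCET / OTHER", allowed, " / ", 10)
print(ok, too_long, unknown)
```

With pandas, the same check could probably be applied as dt[dt.my_col.map(lambda v: is_allowed_path(v, allowed, separator, total_number_of_combinations))], which avoids building the 88572-element list at all.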
",python
+"Python pytest: local variable 'length' referenced before assignmentI am not sure what is wrong in the following code snippet.
+I have the following two versions of a function.
+Version 1
+def _check_array_lengths(self, data):
+ for i, values in data.items():
+ if i == 0:
+ length = len(values)
+ if length != len(values):
+ raise ValueError('All values must be the same length')
+
+When I run test, the above function fails with a msg
+"ERROR tests/test_dataframe.py - UnboundLocalError: local variable 'length' referenced before assignment"
+Version 2
+def _check_array_lengths(self, data):
+ for i, values in enumerate(data.values()):
+ if i == 0:
+ length = len(values)
+ if length != len(values):
+ raise ValueError('All values must be the same length')
+
+The test for this function works fine, and I wonder why I don't see the same error message (mentioned above) here. How is enumerate causing this change in behavior?
+It may be something really silly, but I couldn't figure it out yet.
+Here is my test function
+def test_array_length(self):
+ with pytest.raises(ValueError):
+ pdc.DataFrame({'a': np.array([1, 2]),
+ 'b': np.array([1])})
+
+Can you please help?
","In the second version, the first value of i is guaranteed to be 0. So the condition if i == 0: will be true, and length will be set. Then the comparison if length != len(values): will be able to use the length variable.
+In the first version, i iterates over the dictionary keys, not a numeric sequence. The values of i will be 'a' and 'b'. The if i == 0: condition will never be true, so length is never set, and you get an error when you try to compare it.
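One way to avoid depending on a first-iteration flag at all is to collect the distinct lengths and check that there is at most one. This is only a sketch: it is written as a plain function rather than the method from the question, and the name check_array_lengths is an assumption.

```python
def check_array_lengths(data):
    """Raise ValueError unless every value in data has the same length."""
    lengths = {len(values) for values in data.values()}
    if len(lengths) > 1:
        raise ValueError('All values must be the same length')


check_array_lengths({'a': [1, 2], 'b': [3, 4]})  # passes silently

try:
    check_array_lengths({'a': [1, 2], 'b': [1]})
    raised = False
except ValueError:
    raised = True
print(raised)
```

Because nothing is assigned conditionally inside the loop, the function behaves the same whatever the dictionary keys are.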
",python
+"Custom markdown parser not working properlySo I am making my own little script to detect some markdown notations in a variable. Now there are two issues.
+
+- First of all, my current approach does not work when using the markdown as following (I am referring to the question mark):
+
+**am I working**?
+
+
+- Aside from that, the code looks quite messy, and I am pretty sure that it can be done more efficiently.
+
+I was wondering what a better way to do this would be.
+I want to add that I have looked for packages, but these won't work in my case, as I only want to allow bold, italic, underline and escaping (**bold**, _italic_ and __underline__).
+My current code:
+description = "**one more test** _just to be sure_ \*hi\* apparently __underline broke__?"
+
+for word in description.split(" "):
+ if word.startswith("**"):
+ description = description.replace(word, f'<b>{word.replace("**", "", 1)}')
+ word = word.replace("**", "", 1)
+ if word.endswith("**") and not word.endswith("\**"):
+ description = description.replace(word, f'{word.replace("**", "", 1)}</b>')
+ word = word.replace("**", "", 1)
+ if word.startswith("*"):
+ description = description.replace(word, f'<i>{word.replace("*", "", 1)}')
+ word = word.replace("*", "", 1)
+ if word.endswith("*") and not word.endswith("\*"):
+ description = description.replace(word, f'{word.replace("*", "", 1)}</i>')
+ word = word.replace("*", "", 1)
+ if word.startswith("__"):
+ description = description.replace(word, f'<u>{word.replace("__", "", 1)}')
+ word = word.replace("__", "", 1)
+ if word.endswith("__") and not word.endswith("\__"):
+ description = description.replace(word, f'{word.replace("__", "", 1)}</u>')
+ word = word.replace("__", "", 1)
+ if word.startswith("_"):
+ description = description.replace(word, f'<i>{word.replace("_", "", 1)}')
+ word = word.replace("_", "", 1)
+ if word.endswith("_") and not word.endswith("\_"):
+ description = description.replace(word, f'{word.replace("_", "", 1)}</i>')
+ word = word.replace("_", "", 1)
+
+print(description)
+
","First of all, you should rely on regex expressions to be able to cover the different cases. Also, working with a clean code by defining functions and loops is better for future changes and readability.
+So, to solve your special characters issue (like ?, !, ..etc.), we can have the function below:
+import re
+
+def _check_match(text, chars, start_or_end="start", excluded_chars='()?!'):
+ # check if the text ends or starts with the required characters
+ # but could have any of the excluded characters before (if start) or after (if end)
+ if start_or_end == "start":
+ pattern = re.compile(rf'([{excluded_chars}]*){re.escape(chars)}(.*)$')
+ elif start_or_end == "end":
+ pattern = re.compile(rf'(.*[^\\]){re.escape(chars)}([{excluded_chars}]*)$')
+ else:
+ return False, None, None
+ check = re.match(pattern, text)
+ if check:
+ return True, check.group(1), check.group(2)
+ return False, None, None
+
+def check_match_start(text, chars):
+ return _check_match(text, chars, "start")
+
+def check_match_end(text, chars):
+ return _check_match(text, chars, "end")
+
+Examples:
+check_match_start("**test", "**") >> (True, '', 'test')
+check_match_start("(**test", "**") >> (True, '(', 'test')
+check_match_end("test_?", "_") >> (True, 'test', '?')
+check_match_end("test__?", "*") >> (False, None, None)
+
+Then, we can have another function that loops on all the required markdowns to apply:
+def markdown_text(text):
+ markdown_criteria = [
+ ("**", ["<b>", '</b>']),
+ ("*", ["<i>", "</i>"]),
+ ("__", ["<u>", "</u>"]),
+ ("_", ["<i>", "</i>"])
+ ]
+ for m in markdown_criteria:
+ chars = m[0]
+ replacement_start = m[1][0]
+ replacement_end = m[1][1]
+ my_start_check, matched_special_chars, matched_text = check_match_start(text, chars)
+ if my_start_check:
+ text = f'{matched_special_chars}{replacement_start}{matched_text}'
+ my_end_check, matched_text, matched_special_chars = check_match_end(text, chars)
+ if my_end_check:
+ text = f'{matched_text}{replacement_end}{matched_special_chars}'
+ return text
+
+Examples:
+markdown_text("test__?") >> 'test</u>?'
+markdown_text("*test") >> '<i>test'
+markdown_text("**test**!") >> '<b>test</b>!'
+
+It is time to execute:
+description = r"**one more test** _just to be sure_ \*hi\* apparently __underline broke__?"
+new_description = []
+for word in description.split(" "):
+ new_description.append(markdown_text(word))
+
+new_description = " ".join(new_description)
+print(new_description)
+# <b>one more test</b> <i>just to be sure</i> \*hi\* apparently <u>underline broke</u>?
+
+This solution provides more flexibility to:
+
+- add any other needed markdowns by simply updating the markdown_criteria list (for example, you can add ("***", ["<b><i>", '</i></b>']) to have bold and italic together, but make sure to add it at the beginning of the list so that it is not overridden by the bold and italic criteria that follow)
+- exclude more or fewer special characters by simply updating the excluded_chars parameter of the _check_match function
+
+Needed enhancements:
+This solution doesn't cover the case where a word in your phrase starts with ** but no later word ends with **; you would end up with <b> without a matching </b>. This can be solved by applying a regular expression to the whole phrase instead of splitting it on spaces and handling each word separately.
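A whole-phrase version of that idea might look like the sketch below. It is not the code from the answer above: the RULES list and the markdown_to_html name are assumptions. Each rule only fires when an unescaped opening marker has an unescaped closing marker, so an unpaired ** is left untouched, and the two-character markers are applied before the one-character ones.

```python
import re

# (pattern, replacement) pairs; (?<!\\) skips markers escaped with a backslash
RULES = [
    (r"(?<!\\)\*\*(.+?)(?<!\\)\*\*", r"<b>\1</b>"),
    (r"(?<!\\)__(.+?)(?<!\\)__", r"<u>\1</u>"),
    (r"(?<!\\)\*(.+?)(?<!\\)\*", r"<i>\1</i>"),
    (r"(?<!\\)_(.+?)(?<!\\)_", r"<i>\1</i>"),
]


def markdown_to_html(text):
    """Apply each rule to the whole phrase, longest markers first."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


description = r"**one more test** _just to be sure_ \*hi\* apparently __underline broke__?"
converted = markdown_to_html(description)
print(converted)
```

Like the word-by-word version, this sketch leaves the backslashes of escaped markers in place and does not attempt to handle nested markers.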
",python
+"Unable to do web scraping from URL using Python AlchemyI have a script where I'm trying to scrape data into a table, but I'm getting this error:
+ raise exc.with_traceback(traceback)
+ValueError: No tables found
+
+Script :
+import pandas as pd
+import logging
+from sqlalchemy import create_engine
+from urllib.parse import quote
+
+# username, pwd, host and port are placeholders for your own connection settings
+db_connection = f"mysql://{username}:{quote(pwd)}@{host}:{port}"
+ds_connection = create_engine(db_connection)
+a = pd.read_html("https://www.centralbank.ae/en/forex-eibor/exchange-rates/")
+df = pd.DataFrame(a[0])
+df_final = df.loc[:,['Currency','Rate']]
+df_final.to_sql('rate_table',db_connection,if_exists='append',index=False)
+
+Can anyone suggest on this
","One easy way to obtain those exchange rates would be to scrape the API accessed to retrieve information in page (check Dev Tools - network tab):
+import pandas as pd
+import requests
+from bs4 import BeautifulSoup
+
+headers = {'Accept-Language': 'en-US,en;q=0.9',
+ 'Referer': 'https://www.centralbank.ae/en/forex-eibor/exchange-rates/'
+
+ }
+r = requests.post('https://www.centralbank.ae/umbraco/Surface/Exchange/GetExchangeRateAllCurrency', headers=headers)
+dfs = pd.read_html(r.text)
+print(dfs[0].loc[:,['Currency','Rates']])
+
+This returns:
+
+|  | Currency | Rates |
+|----|----------|-------|
+| 0 | US Dollar | 3.6725 |
+| 1 | Argentine Peso | 0.026993 |
+| 2 | Australian Dollar | 2.52753 |
+| 3 | Bangladesh Taka | 0.038508 |
+| 4 | Bahrani Dinar | 9.74293 |
+| 5 | Brunei Dollar | 2.64095 |
+| 6 | Brazilian Real | 0.706549 |
+| 7 | Botswana Pula | 0.287552 |
+| 8 | Belarus Rouble | 1.45526 |
+| 9 | Canadian Dollar | 2.82565 |
+| 10 | Swiss Franc | 3.83311 |
+| 11 | Chilean Peso | 0.003884 |
+| 12 | Chinese Yuan - Offshore | 0.536978 |
+| 13 | Chinese Yuan | 0.538829 |
+| 14 | Colombian Peso | 0.000832 |
+| 15 | Czech Koruna | 0.149763 |
+| 16 | Danish Krone | 0.496304 |
+| 17 | Algerian Dinar | 0.025944 |
+| 18 | Egypt Pound | 0.191775 |
+| 19 | Euro | 3.69096 |
+| 20 | GB Pound | 4.34256 |
+| 21 | Hongkong Dollar | 0.468079 |
+| 22 | Hungarian Forint | 0.009112 |
+| 23 | Indonesia Rupiah | 0.000248 |
+| 24 | Indian Rupee | 0.045976 |
+| 25 | Iceland Krona | 0.026232 |
+| 26 | Jordan Dinar | 5.17472 |
+| 27 | Japanese Yen | 0.026818 |
+| 28 | Kenya Shilling | 0.030681 |
+| 29 | Korean Won | 0.002746 |
+| 30 | Kuwaiti Dinar | 11.9423 |
+| 31 | Kazakhstan Tenge | 0.007704 |
+| 32 | Lebanon Pound | 0.002418 |
+| 33 | Sri Lanka Rupee | 0.010201 |
+| 34 | Moroccan Dirham | 0.353346 |
+| 35 | Macedonia Denar | 0.059901 |
+| 36 | Mexican Peso | 0.181874 |
+| 37 | Malaysia Ringgit | 0.820395 |
+| 38 | Nigerian Naira | 0.008737 |
+| 39 | Norwegian Krone | 0.37486 |
+| 40 | NewZealand Dollar | 2.27287 |
+| 41 | Omani Rial | 9.53921 |
+| 42 | Peru Sol | 0.952659 |
+| 43 | Philippine Piso | 0.065562 |
+| 44 | Pakistan Rupee | 0.017077 |
+| 45 | Polish Zloty | 0.777446 |
+| 46 | Qatari Riyal | 1.00254 |
+| 47 | Serbian Dinar | 0.031445 |
+| 48 | Russia Rouble | 0.06178 |
+| 49 | Saudi Riyal | 0.977847 |
+| 50 | Sudanese Pound | 0.006479 |
+| 51 | Swedish Krona | 0.347245 |
+| 52 | Singapore Dollar | 2.64038 |
+| 53 | Thai Baht | 0.102612 |
+| 54 | Tunisian Dinar | 1.1505 |
+| 55 | Turkish Lira | 0.20272 |
+| 56 | Trin Tob Dollar | 0.541411 |
+| 57 | Taiwan Dollar | 0.121961 |
+| 58 | Tanzania Shilling | 0.001575 |
+| 59 | Uganda Shilling | 0.000959 |
+| 60 | Vietnam Dong | 0.000157 |
+| 61 | Yemen Rial | 0.01468 |
+| 62 | South Africa Rand | 0.216405 |
+| 63 | Zambian Kwacha | 0.227752 |
+| 64 | Azerbaijan manat | 2.16157 |
+| 65 | Bulgarian lev | 1.8873 |
+| 66 | Croatian kuna | 0.491344 |
+| 67 | Ethiopian birr | 0.069656 |
+| 68 | Iraqi dinar | 0.002516 |
+| 69 | Israeli new shekel | 1.12309 |
+| 70 | Libyan dinar | 0.752115 |
+| 71 | Mauritian rupee | 0.079837 |
+| 72 | Romanian leu | 0.755612 |
+| 73 | Syrian pound | 0.001462 |
+| 74 | Turkmen manat | 1.05079 |
+| 75 | Uzbekistani som | 0.000336 |
+
",python
+"How to append item to match the length of two list in pythonI am working on a Python script that is connected to a server. Every x minutes, the server returns two lists, but their lengths are not the same. For example:
+a = [8, 10, 1, 34]
+b = [4, 6, 8]
+
+As you can see above, a has length 4 and b has length 3. Similarly, it sometimes returns
+a = [3, 6, 4, 5]
+b = [8, 3, 5, 2, 9, 3]
+
+I have to write logic that checks whether the two lists have different lengths and, if so, appends zeros to the end of the shorter list. For example, if the input is:
+a = [3, 6, 4, 5]
+b = [8, 3, 5, 2, 9, 3]
+
+then output will be:
+a = [3, 6, 4, 5, 0, 0]
+b = [8, 3, 5, 2, 9, 3]
+
+Can anyone please help me with this? Thanks
","def pad(list1, list2):
+ # make copies of the existing lists so that original lists remain intact
+ list1_copy = list1.copy()
+ list2_copy = list2.copy()
+
+ len_list1 = len(list1_copy)
+ len_list2 = len(list2_copy)
+ # find the difference in the element count between the two lists
+ diff = abs(len_list1 - len_list2)
+
+ # add `diff` number of elements to the end of the list
+ if len_list1 < len_list2:
+ list1_copy += [0] * diff
+ elif len_list1 > len_list2:
+ list2_copy += [0] * diff
+
+ return list1_copy, list2_copy
+
+
+a = [3, 6, 4, 5]
+b = [8, 3, 5, 2, 9, 3]
+# prints: ([3, 6, 4, 5, 0, 0], [8, 3, 5, 2, 9, 3])
+print(pad(a, b))
+
+a = [8, 10, 1, 34]
+b = [4, 6, 8]
+# prints: ([8, 10, 1, 34], [4, 6, 8, 0])
+print(pad(a, b))
+
",python
+"Generating a global connection in prisma python clientI'm using the Prisma Python client for my MySQL database. I'm wondering if it's possible to use a single global connection instead of having to open and close one whenever I make a query?
+ @commands.has_permissions(manage_guild=True)
+ @commands.slash_command(description="Sets up the server for verification")
+ async def help(self, interaction: disnake.GuildCommandInteraction):
+ await self.client.prisma.connect()
+ response = await self.client.prisma.user.find_first(where={
+ "name": "benn",
+ })
+ print(dict(response))
+ await self.client.prisma.disconnect()
+
+I tried doing the following:
+client.prisma = Prisma(auto_register=True)
+async def connect_to_db():
+ await client.prisma.connect()
+
+asyncio.run(connect_to_db())
+
+However, I get an error:
+disnake.ext.commands.errors.CommandInvokeError: Command raised an exception: RuntimeError: Event loop is closed
+
","It is recommended that you create one instance of PrismaClient and reuse it across your application and you should only set it to a global variable in the development environment only and you do not need to explicitly $disconnect. You can learn more about Prisma connection management in the docs. Also, I’ll encourage you to ask your Prisma Python client questions in prisma-client-py repositories GitHub Discussion
",python
+"How to Download a File after POSTing data using FastAPI?I am creating a small web application that receives text, converts the text to speech, and returns an mp3 file, which is saved to a temporary directory.
+I want to be able to download the file from the html page which will access the FastAPI server for the downloaded file, but I don't know how to do that properly.
+I know with Flask you can do this with:
+from app import app
+from flask import Flask, send_file, render_template
+
+@app.route('/')
+def upload_form():
+ return render_template('download.html')
+
+@app.route('/download')
+def download_file():
+ path = "html2pdf.pdf"
+
+ return send_file(path, as_attachment=True)
+
+if __name__ == "__main__":
+ app.run()
+
+HTML Example:
+<!doctype html>
+<title>Python Flask File Download Example</title>
+<h2>Download a file</h2>
+<p>
+ <a href="{{ url_for('.download_file') }}">Download</a>
+</p>
+
+So how do I replicate this with FastAPI?
+FastAPI Code:
+from fastapi import FastAPI, File, Request, Response, UploadFile
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import FileResponse, HTMLResponse, StreamingResponse
+from fastapi.templating import Jinja2Templates
+from gtts import gTTS
+
+templates = Jinja2Templates(directory="templates")
+
+
+def text_to_speech(language:str, text: str) -> str:
+ tts = gTTS(text=text, lang=language, slow=False)
+ tts.save("./temp/welcome.mp3")
+ #os.system("mpg321 /temp/welcome.mp3")
+ return "Text to speech conversion successful"
+
+
+@app.get("/")
+def home(request: Request):
+ return templates.TemplateResponse("index.html", {"request": request})
+
+@app.get("/text2speech")
+async def home(request: Request):
+ if request.method == "POST":
+ form = await request.form()
+ if form["message"] and form["language"]:
+ language = form["language"]
+ text = form["message"]
+ translate = text_to_speech(language, text)
+ path = './temp/welcome.mp3'
+ value = FileResponse("./temp/welcome.mp3", media_type="audio/mp3")
+ return value
+ # return templates.TemplateResponse(
+ # "index.html",
+ # {"request": request, "message": text, "language": language, "download": value},
+ # )
+
+
+Sample HTML File:
+
+<!doctype html>
+<title>Download MP3 File</title>
+<h2>Download a file</h2>
+<p>
+ <a href="{{ url_for('text2speech') }}">Download</a>
+</p>
+
","Use the Form keyword to define Form-data in your endpoint, and more specifically, use Form(...) to make a parameter required, instead of using await request.form() and manually checking if the user submitted the required parameters. After processing the received data and generating the audio file, you can use FileResponse to return the file to the user. Note: use the headers argument to set the 'Content-Disposition' header using the attachment parameter—as described in this answer—to have the file downloaded to your device. Failing to set the headers, or using the inline parameter isntead, would lead to 405 Method Not Allowed error, as the browser attempts to access the file using a GET request (however, only POST requests are allowed to the /text2speech endpoint). Have a look at Option 1 in the examples below.
+If you wanted the /text2speech endpoint supporting both GET and POST requests (as shown in your question), you could either use @app.api_route("/text2speech", methods=["GET", "POST"]) and use request.method to check which one has been called, or define two different endpoints e.g., @app.post('/text2speech') and @app.get('/text2speech'). However, you don't necessarily need to do that in this case. Additionally, you have added a Download hyperlink to your template for the user to download the file. However, you haven't provided any information as to how you expect this to work. This wouldn't work in a scenario where you don't have static files, but dynamically generated audio files (as in your case), as well as multiple users accessing the API at the same time; unless, for example, you generated random UUIDs for the filenames and saved the files in a StaticFiles directory—or added that unique identifier as a query/path parameter (you could also use cookies instead, see here and here) to the URL in order to identify the file to be downloaded—and sent the URL back to the user. In that case, you would need a Javascript interface/library, such as Fetch API, to make an asynchronous HTTP request—as described in this answer—in order to get the URL to the file and display it in the Download hyperlink. Have a look at Option 2 below. Note: The example in Option 2 uses a simple dict to map the filepaths to UUIDs, for demo purposes. In a real-world scenario, where multiple users access the API and several workers might be used, you may consider using a database storage, or Key-Value stores (Caches), as described here and here. You would also need to have a mechanism for deleting the files from the database and disk, once they have been downloaded, as well as make sure that users do not have unauthorised access to other users' audio files.
+Option 1
+app.py
+from fastapi import FastAPI, Request, Form
+from fastapi.templating import Jinja2Templates
+from fastapi.responses import FileResponse
+import os
+
+app = FastAPI()
+templates = Jinja2Templates(directory="templates")
+
+@app.get('/')
+def main(request: Request):
+ return templates.TemplateResponse("index.html", {"request": request})
+
+@app.post('/text2speech')
+def convert(request: Request, message: str = Form(...), language: str = Form(...)):
+ # do some processing here
+ filepath = './temp/welcome.mp3'
+ filename = os.path.basename(filepath)
+ headers = {'Content-Disposition': f'attachment; filename="{filename}"'}
+ return FileResponse(filepath, headers=headers, media_type="audio/mp3")
+
+An alternative to the above would be to read the file data inside your endpoint (or if the data were already fully loaded into memory, such as here, here and here) and return a custom Response directly, as shown below:
+from fastapi import Response
+
+@app.post('/text2speech')
+ ...
+ with open(filepath, "rb") as f:
+ contents = f.read()
+
+ headers = {'Content-Disposition': f'attachment; filename="{filename}"'}
+ return Response(contents, headers=headers, media_type='audio/mp3')
+
+In case you had to return a file that is too large to fit into memory—e.g., if you have 8GB of RAM, you can’t load a 50GB file—you could use StreamingResponse, which would load the file into memory in chunks and process the data one chunk at a time:
+from fastapi.responses import StreamingResponse
+
+@app.post('/text2speech')
+ ...
+ def iterfile():
+ with open(filepath, "rb") as f:
+ yield from f
+
+ headers = {'Content-Disposition': f'attachment; filename="{filename}"'}
+ return StreamingResponse(iterfile(), headers=headers, media_type="audio/mp3")
+
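Note that yield from f iterates over a binary file line by line, so for MP3 data the chunk sizes are unpredictable. A fixed-size reader is one possible alternative. This is a sketch: the name iter_file_chunks and the 64 KiB default are assumptions, and the temporary file only stands in for the real audio file.

```python
import os
import tempfile


def iter_file_chunks(path, chunk_size=64 * 1024):
    """Yield the file's bytes in blocks of at most chunk_size bytes."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk


# Small demonstration with a temporary file standing in for the MP3.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 150_000)

chunk_sizes = [len(chunk) for chunk in iter_file_chunks(tmp.name)]
os.remove(tmp.name)
print(chunk_sizes)
```

The generator could then be passed to StreamingResponse in place of iterfile(), for example StreamingResponse(iter_file_chunks(filepath), headers=headers, media_type="audio/mp3").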
+templates/index.html
+<!DOCTYPE html>
+<html>
+ <head>
+ <title>Convert Text to Speech</title>
+ </head>
+ <body>
+ <form method="post" action="http://127.0.0.1:8000/text2speech">
+ message : <input type="text" name="message" value="This is a sample message"><br>
+ language : <input type="text" name="language" value="en"><br>
+ <input type="submit" value="submit">
+ </form>
+ </body>
+</html>
+
+Option 2
+app.py
+from fastapi import FastAPI, Request, Form
+from fastapi.templating import Jinja2Templates
+from fastapi.responses import FileResponse
+import uuid
+import os
+
+app = FastAPI()
+templates = Jinja2Templates(directory="templates")
+
+files = {}
+
+@app.get('/')
+def main(request: Request):
+ return templates.TemplateResponse("index.html", {"request": request})
+
+@app.get('/download')
+def download_file(request: Request, fileId: str):
+ filepath = files.get(fileId)
+ if filepath:
+ filename = os.path.basename(filepath)
+ headers = {'Content-Disposition': f'attachment; filename="{filename}"'}
+ return FileResponse(filepath, headers=headers, media_type='audio/mp3')
+
+@app.post('/text2speech')
+def convert(request: Request, message: str = Form(...), language: str = Form(...)):
+ # do some processing here
+ filepath = './temp/welcome.mp3'
+ file_id = str(uuid.uuid4())
+ files[file_id] = filepath
+ file_url = f'/download?fileId={file_id}'
+ return {"fileURL": file_url}
+
+templates/index.html
+<!DOCTYPE html>
+<html>
+ <head>
+ <title>Convert Text to Speech</title>
+ </head>
+ <body>
+ <form method="post" id="myForm">
+ message : <input type="text" name="message" value="This is a sample message"><br>
+ language : <input type="text" name="language" value="en"><br>
+ <input type="button" value="Submit" onclick="submitForm()">
+ </form>
+
+ <a id="downloadLink" href=""></a>
+
+ <script type="text/javascript">
+ function submitForm() {
+ var formElement = document.getElementById('myForm');
+ var data = new FormData(formElement);
+ fetch('/text2speech', {
+ method: 'POST',
+ body: data,
+ })
+ .then(response => response.json())
+ .then(data => {
+ document.getElementById("downloadLink").href = data.fileURL;
+ document.getElementById("downloadLink").innerHTML = "Download";
+ })
+ .catch(error => {
+ console.error(error);
+ });
+ }
+ </script>
+ </body>
+</html>
+
+Removing a File after it has been downloaded
+To remove a file after it has been downloaded by the user, you can simply define a BackgroundTask to be run after returning the response. For example, for Option 1 above:
+from fastapi import BackgroundTasks
+import os
+
+@app.post('/text2speech')
+def convert(request: Request, background_tasks: BackgroundTasks, ...):
+ filepath = 'welcome.mp3'
+ # ...
+ background_tasks.add_task(os.remove, path=filepath)
+ return FileResponse(filepath, headers=headers, media_type="audio/mp3")
+
+For Option 2, however, you would have to make sure to delete the key (i.e., file_id) pointing to the given filepath from the cache as well. Hence, you should create a task function, as shown below:
+from fastapi import BackgroundTasks
+import os
+
+files = {}
+
+def remove_file(filepath, fileId):
+ os.remove(filepath)
+ del files[fileId]
+
+@app.get('/download')
+def download_file(request: Request, fileId: str, background_tasks: BackgroundTasks):
+ filepath = files.get(fileId)
+ if filepath:
+ # ...
+ background_tasks.add_task(remove_file, filepath=filepath, fileId=fileId)
+ return FileResponse(filepath, headers=headers, media_type='audio/mp3')
+
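If two requests for the same link arrive before the first cleanup has run, remove_file may be scheduled twice, and the second call would then fail on del files[fileId] or os.remove. A more tolerant variant could look like the sketch below. The demonstration file and the "demo-id" key are made up for illustration.

```python
import os
import tempfile

files = {}  # maps file IDs to file paths, as in Option 2


def remove_file(filepath, file_id):
    """Forget the ID and delete the file, ignoring anything already gone."""
    files.pop(file_id, None)
    try:
        os.remove(filepath)
    except FileNotFoundError:
        pass


# Demonstration with a throwaway file and a made-up ID.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"audio")
files["demo-id"] = tmp.name

remove_file(tmp.name, "demo-id")
remove_file(tmp.name, "demo-id")  # second call is a harmless no-op
```

It is registered in the same way as before, with background_tasks.add_task(remove_file, filepath=filepath, file_id=fileId).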
",python
+"Average price of scraped item on ebay using pythonHow can I get the average price from a list of scraped items' prices from ebay?
+This is my code:
+from urllib.request import Request, urlopen
+from bs4 import BeautifulSoup
+import requests
+from requests_html import HTMLSession
+
+link = "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=5035224123933&_sacat=0&LH_TitleDesc=0&_odkw=EAN5035224123933&_osacat=0"
+
+req = Request(link, headers={'User-Agent': 'Mozilla/5.0'})
+webpage = urlopen(req).read()
+with requests.Session() as c:
+
+ soup = BeautifulSoup(webpage, 'html5lib')
+ lists = soup.find_all('li', class_="s-item s-item__pl-on-bottom s-item--watch-at-corner")
+ for list in(lists):
+ price=float(list.find('span', class_="s-item__price").text.replace('£',''))
+ avg = sum(price)/len(price)
+
+I've tried:
+avg = sum(price)/len(price)
+
+But it gives an error:
+TypeError: 'float' object is not iterable
+
","Assuming that the rest of your code works correctly, and retrieves the prices you need, the problem is with this:
+    for list in(lists):
+        price=float(list.find('span', class_="s-item__price").text.replace('£',''))
+        avg = sum(price)/len(price)
+
+You say price=float(..) - so yes, price is a floating point number and thus trying to sum() and len() it on the next line doesn't make sense to Python. You probably wanted to put all those prices in a list (e.g. prices) and then compute sum(prices) / len(prices)
+Something like:
+    prices = []
+    for list in lists:
+        prices.append(float(list.find('span', class_="s-item__price").text.replace('£','')))
+    avg = sum(prices) / len(prices)
+
+To understand why you got that error, consider:
+>>> sum(1.0)
+Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+TypeError: 'float' object is not iterable
+>>> len(1.0)
+Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+TypeError: object of type 'float' has no len()
+
+So, you see, those operations don't work on individual values; they work on an iterable with a length (like a list). Another clue about your original code is that the computation of the average was inside the loop (indented), but you only want to compute the average once, not for every price.
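+For illustration, here is a minimal, self-contained sketch of the same idea, assuming the price strings have already been scraped. The sample strings and the parse_price helper are made up for this example; real listings may also contain price ranges, which this sketch does not handle.
+
```python
from statistics import mean

def parse_price(text):
    # Strip the currency symbol and thousands separators before converting.
    return float(text.replace('£', '').replace(',', '').strip())

# Hypothetical scraped strings standing in for the s-item__price spans.
scraped = ['£12.99', '£8.50', '£1,020.00']

# Collect every price first, then average once, outside any loop.
prices = [parse_price(text) for text in scraped]
average = mean(prices)
print(round(average, 2))
```
+
+statistics.mean raises StatisticsError on an empty list, which is arguably clearer than the ZeroDivisionError you would get from sum(prices) / len(prices).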
",python
+"Fetching data using React to a Python backend
+I am using Python (Bottle framework) as the backend and React as the frontend. I want to fetch data from React through "http://127.0.0.1:8080/" using the useEffect Hook. The problem is that I keep getting an error:
+[screenshot of the error; image not preserved]
+Here is my backend (python)
+@get('/')
+def _():
+    data = {
+        'name': 'Alvin',
+        'lastname': 'Johnson'
+    }
+    return json.dumps(data)
+
+And here is the frontend (react)
+import React, { useEffect, useState } from 'react';
+
+import './App.css';
+
+function App() {
+
+ const [data, setdata] = useState({
+ name: '',
+ lastname: ''
+ });
+
+ useEffect(() => {
+ fetch('/').then((res) =>
+ res.json()).then((data) => {
+ console.log(data);
+ setdata({
+ name: data.name,
+ lastname: data.lastname
+ });
+ })
+ }, []);
+
+ return (
+ <div className="App">
+ <h1>Welcome to Python-React app</h1>
+ <p>{data.name}</p>
+ <p>{data.lastname}</p>
+ </div>
+ );
+}
+
+export default App;
+
+I have also added "proxy" on package.json file
+"proxy":"http://127.0.0.1:8080/",
+
+Is there anything I am doing wrong? Thank you in advance.
","consider using axios, thats what I use for my Flask backend.
+import axios from 'axios';
+import React, { useEffect, useState } from 'react';
+import './App.css';
+
+function App() {
+
+ const [data, setdata] = useState({
+ name: '',
+ lastname: ''
+ });
+
+ const fetchData = async () => {
+ const response = await axios.get(`http://127.0.0.1:8080/`)
+ setdata(response.data);
+ }
+
+ useEffect(() => {
+ fetchData();
+ }, [])
+
+ return (
+ <div className="App">
+ <h1>Welcome to Python-React app</h1>
+ <p>{data.name}</p>
+ <p>{data.lastname}</p>
+ </div>
+ );
+}
+
+export default App;
+
",python
+"How to get text data from a single tag without comma separator using Scrapy
+Below is the HTML snippet:
+<P class="subtitulo">
+ <b>
+ <a name="Editores"> Editorial </a>
+ "assistant"
+ </b>
+</p>
+
+Using this Scrapy code:
+response.css("p.subtitulo *::text").extract()
+
+I get:
+['Editorial', ' Assistant']
+
+And with:
+response.css("p.subtitulo *::text").get()
+
+I get only "Assistant".
+I want the full string without any commas, like:
+"Editorial Assistant"
+
+Using Beautiful soup I am getting the text without comma. But how to do it with Scrapy. Since I have other roles separated by commas I don't want to use split().
+This is the page url
+http://www.scielo.org.co/revistas/zop/iedboard.htm
","You can do that by invoking .join() and .getall() method as follows:
+import scrapy
+
+class TestSpider(scrapy.Spider):
+    name = 'test'
+    start_urls = ['http://www.scielo.org.co/revistas/zop/iedboard.htm']
+
+    def parse(self, response):
+        for p in response.css('.subtitulo')[1:]:
+            yield {
+                'Name': ''.join(p.css("::text").getall())
+            }
+
+Output:
+{'Name': 'Editorial Assistant'}
+2022-08-08 15:39:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://www.scielo.org.co/revistas/zop/iedboard.htm>
+{'Name': 'Editorial Committee '}
+2022-08-08 15:39:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://www.scielo.org.co/revistas/zop/iedboard.htm>
+{'Name': 'Scientific Committee'}
+2022-08-08 15:39:03 [scrapy.core.scraper] DEBUG: Scraped from <200 http://www.scielo.org.co/revistas/zop/iedboard.htm>
+{'Name': 'Editorial Universidad Del Norte'}
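+
+If the joined string still contains stray newlines or double spaces from the HTML source, you can normalize the whitespace after joining. This is only a sketch in plain Python: the fragments list is a made-up stand-in for what .getall() might return, and join_text_fragments is a hypothetical helper name.
+
```python
def join_text_fragments(fragments):
    # Concatenate the fragments, then collapse runs of whitespace
    # (including newlines from the HTML source) into single spaces.
    return ' '.join(''.join(fragments).split())

# Hypothetical fragments, shaped like what .getall() might return
fragments = ['\n  ', ' Editorial ', '\n  Assistant\n  ', '\n']
result = join_text_fragments(fragments)
print(result)  # Editorial Assistant
```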
+
",python
+"Parsing unusual data (chess data) in a dataframe column
+I'm writing a function to put data from a chess website into a dataframe.
+The code I was able to write almost got the job done but not quite.
+import chessdotcom
+import pandas as pd
+import regex as re
+import json
+from io import StringIO
+#
+def cleandata(datacall):
+    data = datacall.text
+    x = json.loads(data)
+    df = pd.read_json(StringIO(json.dumps(x)))
+    df2 = pd.json_normalize(df['games'])
+    df2.to_csv(r'chessdataframegamesv3.csv')
+#
+datacall = chessdotcom.client.get_player_games_by_month("Player1", 2022, 7)
+cleandata(datacall)
+
+This produced a csv file that had the columns url, pgn, time_control, etc.
+The pgn column is the one that I'm having trouble working with.
+It has a lot of information for each row. Just one row has Event: Live Chess, Site "Chess.com", Date "2022.07.01", Round "-", White "Player2",Black "Player1",Result "1-0",CurrentPosition "6Q1/5K1k/8/2p5/1p6/p7/1PN5/8 b - -",Timezone "UTC", ECO "B18", ECOUrl "https://www.chess.com/openings/Caro-Kann-Defense-Classical-Variation-5.Ng3-Bg6-6.Nf3-Nd7-7.Bd3", UTCDate "2022.07.01", UTCTime "13:37:42", WhiteElo "1611", BlackElo "1616", TimeControl "180+2", Termination "Player2 won on time", StartTime "13:37:42", EndDate "2022.07.01", EndTime "13:47:01", Link "https://www.chess.com/game/live/50421603881". These are each in square brackets. These are also followed by the moves played in the game
+Ideally, this information should for the most part be separate columns. So there should be a column "Event" with value "Live Chess" for example. Also, it would be good to have a separate row for each move, in a "moves" column. However, having all the moves in the moves column would also be acceptable. Does anyone know how to do this? This is my first question here. So I hope I'm clear enough. Thanks.
","I suggest to use python-chess for pgn parsing:
+pip install chess
+
+Then you can use it as follows:
+import json
+from io import StringIO
+
+import chess.pgn
+from chessdotcom import client
+import pandas as pd
+
+def read_pgn(pgn_str):
+    game = chess.pgn.read_game(StringIO(pgn_str))
+    headers = game.headers
+    headers['moves'] = game.mainline_moves()
+    return headers
+
+# "erik" is taken as an example from here: https://www.chess.com/news/view/published-data-api
+player, year, month = "erik", 2022, 7
+r = client.get_player_games_by_month(player, year, month)
+
+games = json.loads(r.text)['games']
+df1 = pd.json_normalize(games)
+df2 = pd.DataFrame(list(df1.pgn.apply(read_pgn)))
+pd.concat([df1, df2], axis=1).drop(columns='pgn')
+
+This will give you the following dataframe:
+ url time_control end_time rated ... Variant White WhiteElo moves
+0 https://www.chess.com/game/daily/410529597 1/259200 1656675306 False ... NaN TheMsquare 1592 3. Nc3 { [%clk 69:37:31] } 3... Bg7 { [%clk 71...
+1 https://www.chess.com/game/daily/387673189 1/604800 1656675605 True ... Chess960 erik 1434 1. b3 { [%clk 167:59:49] } 1... Ne6 { [%clk 16...
+2 https://www.chess.com/game/daily/410866161 1/259200 1656679900 True ... NaN Lee 1356 1. e4 { [%clk 62:04:50] } 1... d6 { [%clk 71:3...
+3 https://www.chess.com/game/live/50449822647 60 1656711104 True ... NaN rexzs 1744 1. e3 { [%clk 0:01:00] } 1... d6 { [%clk 0:01:...
+4 https://www.chess.com/game/live/50450421549 60 1656711671 True ... NaN erik 1823 1. e4 { [%clk 0:01:00] } 1... c6 { [%clk 0:01:...
+.. ... ... ... ... ... ... ... ... ...
+214 https://www.chess.com/game/live/52974588839 60 1659235731 True ... NaN erik 1741 1. e4 { [%clk 0:01:00] } 1... e6 { [%clk 0:01:...
+215 https://www.chess.com/game/live/52975772965 60 1659236887 True ... NaN GeneralCoin 1672 1. d4 { [%clk 0:01:00] } 1... d6 { [%clk 0:01:...
+216 https://www.chess.com/game/live/52976977887 60 1659238132 True ... NaN MarcosP99 1772 1. e4 { [%clk 0:01:00] } 1... d6 { [%clk 0:01:...
+217 https://www.chess.com/game/live/52977007497 60 1659238338 True ... NaN erik 1730 1. e4 { [%clk 0:01:00] } 1... c5 { [%clk 0:01:...
+218 https://www.chess.com/game/live/53060949585 60 1659321932 True ... NaN erik 1724 1. e4 { [%clk 0:01:00] } 1... c5 { [%clk 0:01:...
+
+[219 rows x 50 columns]
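+
+If you only need the bracketed tag pairs and cannot install python-chess, a regular expression can pull them out. Note that this is a different, much cruder technique than the parser used above: it is a sketch that assumes well-formed tags of the form [Key "Value"], and parse_pgn_text and the sample string are made up for illustration.
+
```python
import re

# Matches PGN tag pairs such as [Event "Live Chess"]
TAG_PATTERN = re.compile(r'\[(\w+)\s+"([^"]*)"\]')

def parse_pgn_text(pgn_text):
    headers = dict(TAG_PATTERN.findall(pgn_text))
    # Every non-empty line that does not start with '[' is treated as movetext.
    movetext = ' '.join(
        line.strip()
        for line in pgn_text.splitlines()
        if line.strip() and not line.lstrip().startswith('[')
    )
    return headers, movetext

sample = '''[Event "Live Chess"]
[White "Player2"]
[Black "Player1"]
[Result "1-0"]

1. e4 c6 2. d4 d5 1-0'''

headers, moves = parse_pgn_text(sample)
print(headers['Event'], '|', moves)
```
+
+Each headers dict can then be passed to pd.DataFrame in the same way as the output of read_pgn above.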
+
",python
+"tkinter multiple buttons invoke same function, how to determine the clicked one?
+I used a for loop to render 15 buttons, and each button invokes the same function to do something. My question is how to determine which button was clicked.
+My code snippet is below:
+for number in range(1, 16):
+    ttk.Button(bottom_frame, text='Read', command=read_one).grid(column=4, row=number, padx=5, pady=5)
+
+I want to reuse the function read_one() for every button, but don't know how to determine which button is clicked. Any comment is appreciated!
+Here's my test code: https://pastebin.com/fWyyNVw7
","Since the command callback doesn't get passed any parameters by default (like the calling control), there's no easy option.
+However, you could use something like this:
+for number in range(1, 16):
+    ttk.Button(bottom_frame, text='Read', command=lambda number=number: read_one(number)
+               ).grid(column=4, row=number, padx=5, pady=5)
+
+That way, read_one will be called with the number passed. Note that you'd need your read_one function to deal with it, e.g.:
+def read_one(number):
+    # do something with number, since it tells you what button was pushed
+    ...
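+
+The number=number default argument matters. The sketch below shows the difference without any GUI; the read_one body and the two lists are made up for the demonstration.
+
```python
def read_one(number):
    return f'button {number} clicked'

# Late binding: every lambda looks up `number` when it is called,
# so all of them see the final loop value.
late_bound = [lambda: read_one(number) for number in range(1, 4)]

# Default argument: the current value is captured when the lambda is created.
captured = [lambda number=number: read_one(number) for number in range(1, 4)]

late_results = [callback() for callback in late_bound]
captured_results = [callback() for callback in captured]
print(late_results)
print(captured_results)
```
+
+functools.partial(read_one, number) is an equivalent way to bind the value at creation time.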
+
",python
+"How do I compare two lists of pairs to see which pair combinations exist in both?
+Here is a simple example of what I am trying to do. So with two lists of pairs, such as:
+pairs1 = [(egg,dog),(apple,banana),(orange,chocolate),(elephant,gargoyle),(cat,lizard)]
+pairs2 = [(cat,lizard),(ice,hamster),(elephant,giraffe),(apple,gargoyle),(dog,egg)]
+
+I want to be able to retrieve the pair combinations that the two lists have in common. So for these two lists, the pairs retrieved would be (cat,lizard) and (dog,egg). The order of the elements within the pair doesn't matter, just the fact that the same combination appears in one tuple.
","Try:
+pairs1 = [
+ ("egg", "dog"),
+ ("apple", "banana"),
+ ("orange", "chocolate"),
+ ("elephant", "gargoyle"),
+ ("cat", "lizard"),
+]
+pairs2 = [
+ ("cat", "lizard"),
+ ("ice", "hamster"),
+ ("elephant", "giraffe"),
+ ("apple", "gargoyle"),
+ ("dog", "egg"),
+]
+
+x = set(map(frozenset, pairs1)).intersection(map(frozenset, pairs2))
+print(list(map(tuple, x)))
+
+Prints (the order may vary, because sets are unordered):
+[('lizard', 'cat'), ('egg', 'dog')]
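+
+This works because a frozenset ignores element order and is hashable, so it can be put in a set. If you want a reproducible result, you can sort the output. A small sketch follows; common_pairs is a hypothetical helper name and the lists are shortened versions of the ones above.
+
```python
def common_pairs(first, second):
    # frozenset ignores element order, so ('dog', 'egg') matches ('egg', 'dog')
    shared = set(map(frozenset, first)) & set(map(frozenset, second))
    # Sort inside each pair and across pairs for a reproducible result
    return sorted(tuple(sorted(pair)) for pair in shared)

pairs1 = [('egg', 'dog'), ('apple', 'banana'), ('cat', 'lizard')]
pairs2 = [('cat', 'lizard'), ('apple', 'gargoyle'), ('dog', 'egg')]
result = common_pairs(pairs1, pairs2)
print(result)  # [('cat', 'lizard'), ('dog', 'egg')]
```
+
+One caveat: a pair with two identical elements, such as ('a', 'a'), collapses to a one-element frozenset and comes back as a 1-tuple.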
+
",python
+"Camera is not detecting the face and showing details
+I am getting a type error in my Python code.
+Code:
+my_cursor.execute("select Gender from student where Id=" + str(id))
+g = my_cursor.fetchone()
+g = "+".join(g)
+
+Error:
+n="+".join(n)
+TypeError: can only join an iterable
+
","you can use my_cursor as an iterable result
+my_cursor.execute("select Gender from student where Id=" + str(id))
+# Each row is a tuple, so take the first column of every row
+result = "+".join(row[0] for row in my_cursor)
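+
+A likely cause of the original error is that fetchone() returns None when no row matches, and None cannot be joined. The sketch below reproduces this with an in-memory sqlite3 database standing in for your database (an assumption, since the question does not say which driver is used); gender_for is a made-up helper.
+
```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE student (Id INTEGER, Gender TEXT)')
conn.execute("INSERT INTO student VALUES (1, 'Female')")

def gender_for(student_id):
    # A parameterized query avoids building SQL by string concatenation.
    row = conn.execute(
        'SELECT Gender FROM student WHERE Id = ?', (student_id,)
    ).fetchone()
    # fetchone() returns None when no row matches.
    return '+'.join(row) if row is not None else None

found = gender_for(1)
missing = gender_for(99)
print(found, missing)  # Female None
```
+
+Note that the placeholder syntax depends on the driver: sqlite3 uses ?, while some other drivers use %s.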
",python
+"Shell - Pass env variable to Python Script
+I have a shell script in which I create a Python file on the fly:
+#!/bin/bash
+
+args=("$@")
+
+GIT_PASSWORD=${args[0]}
+export $GIT_PASSWORD
+
+python - << EOF
+
+import os
+
+print(os.environ.get("GIT_PASSWORD"))
+
+EOF
+
+echo $GIT_PASSWORD
+
+echo "Back to bash"
+
+I want to be able to access the variable GIT_PASSWORD, but unfortunately, I am not able to pass it to the python file.
+Does anyone know what I am doing wrong and how I may fix that?
","The thing is that you're not actually setting an env variable, you need to change the export:
+export GIT_PASSWORD=$GIT_PASSWORD
+
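+The shell expands $GIT_PASSWORD before export runs, so "export $GIT_PASSWORD" tries to export a variable named after the password's value rather than GIT_PASSWORD itself. Since the variable is already set, a plain "export GIT_PASSWORD" would also work.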
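+
+The same rule can be seen from the Python side: a child process only sees variables that are actually in the environment it is given. This is a sketch using subprocess; the value example-secret and the run_child helper are made up.
+
```python
import os
import subprocess
import sys

child_code = "import os; print(os.environ.get('GIT_PASSWORD'))"

def run_child(env):
    # Start a fresh Python process with exactly the given environment.
    result = subprocess.run(
        [sys.executable, '-c', child_code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Copy the current environment, making sure the variable is absent.
base_env = {k: v for k, v in os.environ.items() if k != 'GIT_PASSWORD'}
without_var = run_child(base_env)
with_var = run_child({**base_env, 'GIT_PASSWORD': 'example-secret'})
print(without_var, with_var)  # None example-secret
```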
",python
+"Change a variable given in a function
+I want to make a function that checks if a random number is within a certain value and if it is then reroll the number for that variable.
+But I don't know how to set the input variable without calling its name directly.
+code:
+def checkVal(value,max,min):
+    if value<max or value>min:
+        value=random.randrange((-30/10),(30/10))
+        checkVal(value,min,max)
+    else:
+        pass
+
","You can't change a variable from within a function without using a global; however, that wouldn't be good practice here. Instead, return the value.
+Also, your test would always be True if max > min; you probably meant to swap the conditions (lower > value or value > upper).
+import random
+
+def checkVal(value, upper, lower):
+    if lower > value or value > upper:
+        # randrange needs integers; 30/10 is a float in Python 3
+        value = random.randrange(-30 // 10, 30 // 10)
+        return checkVal(value, upper, lower)
+    return value
+
+value = checkVal(value, upper, lower)
+
+From an algorithmic point of view, using a recursive function as a loop is also not good practice. Use a while loop instead:
+def checkVal(value, upper, lower):
+    while lower > value or value > upper:
+        value = random.randrange(-30 // 10, 30 // 10)
+    return value
+
+value = checkVal(value, upper, lower)
+
+Finally, you can probably remove the loop entirely by choosing the correct bounds directly:
+def checkVal(value, upper, lower):
+    if lower > value or value > upper:
+        value = random.randrange(max(lower, -30 // 10), min(upper, 30 // 10))
+    return value
+
+value = checkVal(value, upper, lower)
+
",python
+"How to compute the kind of distance matrix with vectorization
+I have a NumPy array A of shape 4 x 3 x 2. Each line below is the 2D coordinate of a node. (Every three nodes compose a triangle in my finite element analysis.)
+array([[[0., 2.], #node00
+ [2., 2.], #node01
+ [1., 1.]], #node02
+
+ [[0., 2.], #node10
+ [1., 1.], #node11
+ [0., 0.]], #node12
+
+ [[2., 2.], #node20
+ [1., 1.], #node21
+ [2., 0.]], #node22
+
+ [[0., 0.], #node30
+ [1., 1.], #node31
+ [2., 0.]]]) #node32
+
+I have another numpy array B of coordinates of pre-computed "centers":
+array([[1. , 1.66666667], # center0
+ [0.33333333, 1. ], # center1
+ [1.66666667, 1. ], # center2
+ [1. , 0.33333333]])# center3
+
+How can I efficiently calculate a matrix C of Euclidean distances like this:
+dist(center0, node00) dist(center0,node01) dist(center0, node02)
+dist(center1, node10) dist(center1,node11) dist(center1, node12)
+dist(center2, node20) dist(center2,node21) dist(center2, node22)
+dist(center3, node30) dist(center3,node31) dist(center3, node32)
+
+where dist represents a Euclidean distance formula like math.dist or numpy.linalg.norm? Namely, the i,j element of the result matrix is the distance from center-i to node-ij.
+Vectorized code instead of loops is needed, as my actual data is from medical imaging which is very large. With a nested loop, one can obtain the expected output as follows:
+In [63]: for i in range(4):
+ ...: for j in range(3):
+ ...: C[i,j]=math.dist(A[i,j], B[i])
+
+In [67]: C
+Out[67]:
+array([[1.05409255, 1.05409255, 0.66666667],
+ [1.05409255, 0.66666667, 1.05409255],
+ [1.05409255, 0.66666667, 1.05409255],
+ [1.05409255, 0.66666667, 1.05409255]])
+
+[Edit] This is different question from Pairwise operations (distance) on two lists in numpy, as things like indexing needs to be properly addressed here.
","a = np.reshape(A, [12, 2])
+b = B[np.repeat(np.arange(4), 3)]
+c = np.reshape(np.linalg.norm(a - b, axis=-1), (4, 3))
+c
+# array([[1.05409255, 1.05409255, 0.66666667],
+# [1.05409255, 0.66666667, 1.05409255],
+# [1.05409255, 0.66666667, 1.05409255],
+# [1.05409255, 0.66666667, 1.05409255]])
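+
+An alternative that avoids the reshape and repeat is to let broadcasting pair each center with its own three nodes. This is a sketch that needs NumPy installed; it rebuilds A from the question and computes B as the triangle centroids, which is an assumption that happens to reproduce the centers listed there.
+
```python
import numpy as np

A = np.array([[[0., 2.], [2., 2.], [1., 1.]],
              [[0., 2.], [1., 1.], [0., 0.]],
              [[2., 2.], [1., 1.], [2., 0.]],
              [[0., 0.], [1., 1.], [2., 0.]]])
B = A.mean(axis=1)  # centroids, shape (4, 2)

# B[:, None, :] has shape (4, 1, 2) and broadcasts against A's (4, 3, 2),
# so center i is subtracted from each of the three nodes of triangle i.
C = np.linalg.norm(A - B[:, None, :], axis=-1)
print(C.round(8))
```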
+
",python
+"How does python handle list unpacking, redefinition, and reference?
+I am new to Python and am trying to understand how it handles copies vs references with respect to list unpacking. I have a simple code snippet and am looking for an explanation as to why it is behaving the way it does.
+arr = [1, 2, 3, 4]
+[one, two, three, four] = arr
+print(id(arr[0]), arr[0])
+print(id(one), one)
+one = 5
+print(id(one), one)
+
+The output is:
+(16274840, 1)
+(16274840, 1)
+(16274744, 5)
+
+I am not sure why one is all of a sudden moved to a different memory location when I try to modify its contents.
+I am using python version 2.7.18.
+This is my first post, so I apologize in advance if I am not adhering to the guidelines. Please let me know if I have violated them.
+Thank you for all the responses. They have helped me boil down my misunderstanding to this code:
+var = 1
+print(id(var), var)
+var = 5
+print(id(var), var)
+
+With output:
+(38073752, 1)
+(38073656, 5)
+
+Asking about lists and unpacking them was completely obfuscatory.
+This does a great job of explaining:
+http://web.stanford.edu/class/archive/cs/cs106a/cs106a.1212/handouts/mutation.html
","The id/address is not associated with the variable/name; it's associated with the data that the variable is referring to.
+The 1 object is, in this instance, at address 16274840, and the 5 object is at address 16274744. one = 5 causes one to now refer to the 5 object which is at location 16274744.
+
+Just to rephrase this in terms of C, I think your question essentially boils down to "why does the following not modify the first element of arr?" (I'm ignoring unpacking since it isn't actually relevant to the question):
+arr = [1, 2, 3, 4]
+one = arr[0]
+one = 5
+
+I would approximate that code to the following C which also does not modify arr:
+int internedFive = 5;
+
+int arr[4] = {1, 2, 3, 4};
+
+int* one = &arr[0];
+one = &internedFive;
+
+printf("%d", arr[0]); // Prints 1
+
+one originally pointed to the first element of arr, but was reassigned to point to the 5. This reassignment of the pointer has no effect on the data location originally pointed to by one and arr[0].
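+
+The difference between rebinding a name and mutating the list can be checked directly in Python. A minimal sketch, with variable names chosen only for this illustration:
+
```python
arr = [1, 2, 3, 4]
one = arr[0]          # `one` and arr[0] refer to the same int object
same_before = one is arr[0]

one = 5               # rebinds the name `one`; the list is untouched
after_rebinding = list(arr)

arr[0] = 5            # mutates the list: slot 0 now refers to 5
after_mutation = list(arr)

print(same_before, after_rebinding, after_mutation)
```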
",python
+"Finding Value between two numbers in pandas dataframe
+I have two pandas dataframes, "A" and "B". For each value in "A", I would like to find the row numbers of the two values in "B" that it lies between.
+Table A
+
+| Index | 0 |
+|-------|-------|
+| 0 | 0.084 |
+| 1 | 0.169 |
+| 2 | 0.252 |
+| 3 | 0.337 |
+| 4 | 0.419 |
+| 5 | 0.504 |
+| 6 | 0.589 |
+
+Table B
+
+| Index | 0 |
+|-------|-------|
+| 0 | 0.071 |
+| 1 | 0.167 |
+| 2 | 0.244 |
+| 3 | 0.320 |
+
+Let's take one example from the tables above. The first number in Table "A" is 0.084, which lies between the values at index 0 and 1 of Table B, i.e. 0.071 and 0.167. I am looking for an output of [0,1], which is the row numbers of those two values.
","First initialize empty array for result:
+res = [[]] * len(A.iloc[:, 0])
+
+Then loop over A and B with a nested loop and check whether each value in A lies between two consecutive values in B.
+The condition identifies only the start index j:
+(A.iloc[:, 0][i] > B.iloc[:, 0][j]) & (A.iloc[:, 0][i] < B.iloc[:, 0][j+1])
+
+So I store both j and j + 1 in the result list:
+res[i]=([j , j+1])
+
+The full code:
+import pandas as pd
+
+A = [0.084, 0.169, 0.252, 0.337, 0.419, 0.504, 0.589]
+B = [0.071, 0.167, 0.244, 0.320]
+
+A = pd.DataFrame(A)
+B = pd.DataFrame(B)
+
+res = [[]] * len(A.iloc[:, 0])
+
+for i in range(0, len(A.iloc[:, 0])):
+    for j in range(0, len(B.iloc[:, 0])-1):
+        if (A.iloc[:, 0][i] > B.iloc[:, 0][j]) & (A.iloc[:, 0][i] < B.iloc[:, 0][j+1]):
+            res[i]=([j , j+1])
+
+print(res)
+
+The output:
+[[0, 1], [1, 2], [2, 3], [], [], [], []]
+Note: I assume that B is always sorted in ascending order
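+
+Because B is sorted, the inner loop can be replaced by a binary search. The following is a sketch using the standard-library bisect module on plain lists rather than dataframes; bracketing_rows is a made-up name, and values equal to a boundary are treated as out of range to match the strict comparisons above.
+
```python
from bisect import bisect_left

def bracketing_rows(values, boundaries):
    result = []
    for v in values:
        # Index of the first boundary that is >= v
        i = bisect_left(boundaries, v)
        if 0 < i < len(boundaries) and boundaries[i] != v:
            result.append([i - 1, i])
        else:
            result.append([])  # outside B's range, or exactly on a boundary
    return result

A = [0.084, 0.169, 0.252, 0.337, 0.419, 0.504, 0.589]
B = [0.071, 0.167, 0.244, 0.320]
rows = bracketing_rows(A, B)
print(rows)
```
+
+With dataframes you could pass A.iloc[:, 0].tolist() and B.iloc[:, 0].tolist() to the same function.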
",python
+"I get an [AttributeError: module 'code' has no attribute 'Main'] error, but the attribute is clearly there. Why?
+I'm trying to link a Discord bot to a text game I made, but when I attempt to instantiate the game's class, it tells me:
+AttributeError: module 'code' has no attribute 'Main'
+
+Here's my code:
+# bot.py
+import code # Import the code for the actual game
+
+main = code.Main() # Begin the game's processes
+
+# code.py
+class Main:
+
+    def __init__(self):
+        self.otherModule() # This module is used to continue the flow throughout the class
+
+I can't see what's wrong with it. When I try to look it up, I'm only told to "just give it a class."
","Rename code. The module code already exists as a built-in module in Python3.
+Python 3.10.4
+Type "help", "copyright", "credits" or "license" for more information.
+>>> import code
+>>> print(code)
+<module 'code' from '/usr/lib/python3.10/code.py'>
+
+Source: https://docs.python.org/3/library/code.html#module-code
",python
+"Django Generate cookie before loading page
+I'm building an e-commerce website and I'm generating a device cookie to store a UUID for unauthenticated users. If a user is not logged in, the site looks up their cookie to display the quantity of items in the cart for that device (in navbar.html). When I load any page I get a KeyError: 'device', because the template calls a function that looks for a cookie which doesn't exist yet. Is there any workaround?
+base.html
+<html lang="en">
+ <head>
+ <script>
+
+ function getCookie(name) {
+ var cookieValue = null;
+ if (document.cookie && document.cookie !== '') {
+ var cookies = document.cookie.split(';');
+ for (var i = 0; i < cookies.length; i++) {
+ var cookie = cookies[i].trim();
+ // Does this cookie string begin with the name we want?
+ if (cookie.substring(0, name.length + 1) === (name + '=')) {
+ cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
+ break;
+ }
+ }
+ }
+ return cookieValue;
+ }
+
+ let device = getCookie('device');
+
+ function uuidv4() {
+ return ([1e7]+-1e3+-4e3+-8e3+-1e11).replace(/[018]/g, c =>
+ (c ^ crypto.getRandomValues(new Uint8Array(1))[0] & 15 >> c / 4).toString(16)
+ );
+ }
+
+ if (device == null || device == undefined) {
+ device = uuidv4();
+ }
+
+ document.cookie = 'device=' + device + ';domain=;path=/'
+ </script>
+ </head>
+ {% include "navbar.html" %}
+</html>
+
+navbar.html
+<span class="badge red z-depth-1 mr-1"> {{ request|cart_item_count }} </span>
+
+cart_tag.py
+from django import template
+from store.models import Order, Customer
+register = template.Library()
+
+@register.filter
+def cart_item_count(request):
+    try:
+        customer = request.user.customer
+    except AttributeError:
+        device = request.COOKIES['device']
+        customer, _ = Customer.objects.get_or_create(device=device)
+    qs = Order.objects.filter(customer=customer, ordered=False)
+    if qs.exists():
+        return qs[0].items.count()
+    return 0
+
+
+EDIT:
+views.py
+from django.views.generic import ListView
+from .models import ShopItem
+class HomeView(ListView):
+ model = ShopItem
+ paginate_by = 4
+ template_name = "home-page.html"
+
+models.py
+from django.db import models
+class ShopItem(models.Model):
+ id = models.AutoField(primary_key=True)
+ title = models.CharField(max_length=100)
+ description = models.TextField()
+
","I think you should change cart_tag.py so that it handles a missing device cookie, for example like this:
+from django import template
+from store.models import Order, Customer
+
+register = template.Library()
+
+
+@register.filter
+def cart_item_count(request):
+    customer = None
+    try:
+        customer = request.user.customer
+    except AttributeError:
+        device = request.COOKIES.get('device', '')
+        if device:
+            customer, _ = Customer.objects.get_or_create(device=device)
+    if customer:
+        qs = Order.objects.filter(customer=customer, ordered=False)
+        if qs.exists():
+            return qs[0].items.count()
+    return 0
+
",python
+"Disagreement in confusion matrix and accuracy when using data generator
+I was working on a model based on the following code:
+epoch=100
+model_history = model.fit(train_generator,
+                          epochs=epoch,
+                          validation_data=test_generator,
+                          callbacks=[model_es, model_rlr, model_mcp])
+
+After model training, when I evaluated the model using the following code, I got an accuracy of 98.9%:
+model.evaluate(test_generator)
+
+41/41 [==============================] - 3s 68ms/step - loss: 0.0396 - accuracy: 0.9893
+[0.039571091532707214, 0.9893211126327515]
+In order to analyse the result, I tried to obtain a confusion matrix of the test_generator using the following code
+y_pred = model.predict(test_generator)
+y_pred = np.argmax(y_pred, axis=1)
+print(confusion_matrix(test_generator.classes, y_pred))
+
+However the output is
+[[ 68 66 93 73]
+ [ 64 65 93 84]
+ [ 91 102 126 86]
+ [ 69 75 96 60]]
+
+which strongly disagrees with the result of model.evaluate.
+Can anyone help me obtain the actual confusion matrix for the model?
+[plot: history of model accuracy; image not preserved]
+Entire code: https://colab.research.google.com/drive/1wpoPjnSoCqVaA--N04dcUG6A5NEVcufk?usp=sharing
","From your code, change:
+test_generator=train_datagen.flow_from_directory(
+    locat_testing,
+    class_mode='binary',
+    color_mode='grayscale',
+    batch_size=32,
+    target_size=(img_size,img_size)
+)
+
+To include the shuffle parameter:
+test_generator=train_datagen.flow_from_directory(
+    locat_testing,
+    class_mode='binary',
+    color_mode='grayscale',
+    batch_size=32,
+    target_size=(img_size,img_size),
+    shuffle=False
+)
+
+By default the generator shuffles its samples, so the predictions come back in a different order than test_generator.classes. With shuffle=False the two line up, and your confusion matrix should agree with model.evaluate instead of looking like random guessing.
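+
+To see why the ordering alone can produce a matrix like that, here is a small simulation in plain Python. It is only an illustration: the labels are randomly generated, and the model is pretended to be perfect.
+
```python
import random

rng = random.Random(0)
labels = [rng.randrange(4) for _ in range(1000)]  # hypothetical true classes
predictions = list(labels)                         # a "perfect" model

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

aligned = accuracy(labels, predictions)

# Shuffling the predictions mimics reading them in a different order
# than the stored labels.
shuffled = list(predictions)
rng.shuffle(shuffled)
misaligned = accuracy(labels, shuffled)

print(aligned, misaligned)
```
+
+Even with perfect predictions, the misaligned score drops to roughly chance level for four classes, which matches the look of your confusion matrix.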
",python
+"iteratively retrieving information contained in different rows in pandas dataframe
+I have a dataframe which contains posts and comments.
+Every comment has an id and a parent id (which identifies the comment or post that the comment was a response to).
+Posts only have an id, since they don't respond to anything.
+
+| submission | id | parent id |
+|------------|----|-----------|
+| post1 | 1 | |
+| comment1 | 2 | 1 |
+| comment2 | 3 | 1 |
+| comment3 | 4 | 2 |
+| comment4 | 5 | 4 |
+| post2 | 6 | |
+| comment5 | 7 | 6 |
+
+I would like to retrieve the id of the original post for every comment and obtain something like this:
+
+| submission | id | parent id | ancestor id |
+|------------|----|-----------|-------------|
+| post1 | 1 | | |
+| comment1 | 2 | 1 | 1 |
+| comment2 | 3 | 1 | 1 |
+| comment3 | 4 | 2 | 1 |
+| comment4 | 5 | 4 | 1 |
+| post2 | 6 | | |
+| comment5 | 7 | 6 | 6 |
+
+To do so, I tried to loop from the end of the dataframe to the beginning, iteratively tracing back the parent_id of the parent_id until I found an empty parent_id cell.
+It works on the test dataframe, but it is too slow on the main one. Is there a way to make it more efficient?
+Here is my original code:
+#creating a column for the id of the original post
+df["ancestor"] = df.id
+#obtaining the id of the original post for every comment
+for i in reversed(range(len(df.id))): #looping through the comments
+    id = df["parent_id"][i] #variable to initialize the future loop
+    parent = id
+    while parent != "": #only looping through comments
+        df.ancestor[i] = id
+        parent = df.parent_id[df.id == id].values[0]
+        id = parent
+
","I'm not sure if this speeds things up, but using the NetworkX library might be worth a try:
+import networkx as nx
+
+G = nx.from_pandas_edgelist(
+    df[df["parent_id"].ne("")], source="parent_id", target="id"
+)
+
+roots = set(df.loc[df["parent_id"].eq(""), "id"])
+mapping = {}
+for comp in nx.connected_components(G):
+    root = (roots & comp).pop()
+    roots.discard(root)
+    mapping.update(dict.fromkeys(comp - {root}, root))
+
+df["ancestor_id"] = df["id"].map(mapping)
+
+
+- First read the parent_id-id combos as edges in a graph G.
+- Identify the root nodes roots of the trees (G should be a forest, from what I understand).
+- Then build a mapping nodes -> root (via the components of G), and apply it to the column id.
+
+Result for the example:
+ submission id parent_id ancestor_id
+0 post1 1 NaN
+1 comment1 2 1 1
+2 comment2 3 1 1
+3 comment3 4 2 1
+4 comment4 5 4 1
+5 post2 6 NaN
+6 comment5 7 6 6
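+
+If you would rather avoid the extra dependency, the same mapping can be built with a plain dictionary and caching. Note that this is a different technique from the graph approach above: it is a sketch that assumes there are no cycles, and find_ancestors and the parent_of dict are names made up for the example.
+
```python
def find_ancestors(parent_of):
    """Map each id to the id of its root post (None for posts themselves)."""
    root_cache = {}

    def root(node):
        # Walk up until a node without a parent (or a cached node) is
        # reached, then cache the answer for every node visited on the way.
        path = []
        while node not in root_cache and parent_of.get(node) is not None:
            path.append(node)
            node = parent_of[node]
        top = root_cache.get(node, node)
        for visited in path:
            root_cache[visited] = top
        return top

    return {
        node: (root(node) if parent is not None else None)
        for node, parent in parent_of.items()
    }

# id -> parent id, taken from the example table (None marks a post)
parent_of = {1: None, 2: 1, 3: 1, 4: 2, 5: 4, 6: None, 7: 6}
ancestors = find_ancestors(parent_of)
print(ancestors)
```
+
+Thanks to the cache, each id is resolved only once, so the work grows roughly linearly with the number of rows. The result can be applied with df["id"].map(ancestors).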
+
",python
+"Python BeautifulSoup finding table and parsing it
+This one is odd: I ran this code in the morning and it worked just fine on the HTML from the page. Now when I run it, the tables variable comes back with 0 items, so the for loop never runs and no data is collected or dataframe created.
+def parseForclosure(pagesource):
+ data = []
+ soup = BeautifulSoup(pagesource,'html.parser')
+ tables = soup.find_all('table', attrs={'class':'ad_tab'})
+ print(len(tables))
+ df2 = pd.DataFrame()
+ for i in range(len(tables)):
+ print(i)
+ table_body = tables[i].find('tbody')
+
+ rows = table_body.find_all('tr')
+ for row in rows:
+ cols = row.find_all('td')
+ cols = [ele.text.strip() for ele in cols]
+ data.append([ele for ele in cols if ele])
+
+ data2 ={'AuctionType': [data[0]] ,
+ 'CaseNo': [data[1]],
+ 'FinalJudgmentAmount': [data[2]],
+ 'ParcelID': [data[3]],
+ 'PropertyAddress1': [data[4]],
+ 'PropertyAddress2': [data[5]],
+ 'AssessedValue': [data[6]],
+ 'PlaintiffMaxBid': [data[7]]}
+
+ df = pd.DataFrame(data2, columns=['AuctionType','CaseNo','FinalJudgmentAmount','ParcelID','PropertyAddress1','PropertyAddress2','AssessedValue','PlaintiffMaxBid'] )
+ df2 = df2.append(df)
+ print(df)
+ return(df2)
+
+Here is the call
+ df = parseForclosure(source)
+
+Here is a snippet of what the html look like
+<table class="ad_tab" tabindex="0"><tbody><tr><th class="AD_LBL" scope="row">Auction Type:</th><td class="AD_DTA">FORECLOSURE</td></tr><tr><th aria-label="Case Number" class="AD_LBL" scope="row">Case #:</th><td class="AD_DTA"><a href="/index.cfm?zaction=auction&zmethod=details&AID=103757&bypassPage=1">07009032CA01</a></td></tr><tr><th class="AD_LBL" scope="row">Final Judgment Amount:</th><td class="AD_DTA">$323,248.61</td></tr><tr><th class="AD_LBL" scope="row">Parcel ID:</th><td class="AD_DTA">30-6901-001-2470</td></tr><tr><th class="AD_LBL" scope="row">Property Address:</th><td class="AD_DTA">12260 SW 191 ST</td></tr><tr><th class="AD_LBL" scope="row"></th><td class="AD_DTA">MIAMI, FL- 33177</td></tr> <tr><th class="AD_LBL" scope="row">Assessed Value:</th><td class="AD_DTA">$184,791.00</td></tr><tr><th class="AD_LBL" scope="row">Plaintiff Max Bid:</th><td class="AD_DTA ASTAT_MSGPB">Hidden</td></tr></tbody></table>
+
+You can see sample of all the tables in the link below.
+https://projectcodesamples.s3.amazonaws.com/AuctionSample.html
+My objective is place data points into a dataframe
+Sample file with missing data points:
+Sample_Missing_data_points
+This is a sample file with all datapoints
+Sample_file_with_no_missing_data_points
+Ideally I should be able to extract from both without the dataframe size changing
","Let's say that you have three HTML files with the data you provided since you first posted your question:
+
+- Source.html
+- Source2.html
+- Source3.html
+
+I have used this updated code to combine all the data in one dataframe:
+import io
+import csv
+
+from bs4 import BeautifulSoup
+import pandas as pd
+
+input_files_names = [
+    'Source.html',
+    'Source2.html',
+    'Source3.html'
+]
+def setup_dataframes(files_names):
+    for current_file_name in files_names:
+        with open(current_file_name) as source_file:
+            soup = BeautifulSoup(source_file, 'html.parser')
+
+        field_labels = {
+            'AuctionType': 'Auction Type:',
+            'CaseNo': 'Case #:',
+            'JudgementAmount': 'Final Judgment Amount:',
+            'ParcelID': 'Parcel ID:',
+            'AssessedValue': 'Assessed Value:',
+            'PlaintiffMaxBid': "Plaintiff Max Bid:"
+        }
+
+        column_names = (
+            'AuctionType',
+            'CaseNo',
+            'JudgementAmount',
+            'ParcelID',
+            'PropertyAddress1',
+            'PropertyAddress2',
+            'AssessedValue',
+            'PlaintiffMaxBid'
+        )
+
+        def extract_data(soup):
+            for current_table in soup.find_all('table', class_='ad_tab'):
+                current_auction = {}
+                for (current_field, current_label) in field_labels.items():
+                    current_field_cell = current_table.tbody.find('th', string=current_label)
+                    if current_field_cell is not None:
+                        current_data_cell = current_field_cell.next_sibling
+                        current_auction[current_field] = current_data_cell.get_text()
+
+                address_row = current_table.tbody.find('th', string='Property Address:')
+                if address_row is not None:
+                    current_auction['PropertyAddress1'] = address_row.find_next_sibling('td').get_text()
+
+                    address2_row = address_row.parent.next_sibling.td
+                    if address2_row is not None:
+                        current_auction['PropertyAddress2'] = address2_row.get_text()
+
+                yield tuple(current_auction.get(current_field, '') for current_field in column_names)
+
+        with io.StringIO() as intermediate_data:
+            intermediate_csv = csv.writer(intermediate_data)
+            intermediate_csv.writerows(extract_data(soup))
+            intermediate_data.seek(0, 0)
+            df = pd.read_csv(intermediate_data, header=None, names=column_names)
+
+        yield df
+
+df_composite = pd.concat(setup_dataframes(input_files_names), ignore_index=True)
+print(df_composite)
+
+What has been done here is:
+
+- Extracting the text from each source HTML file by finding each field before creating an output row
+- Creating a temporary, in-memory CSV file using io.StringIO and the csv module
+- Creating a Pandas dataframe from that CSV file using pd.read_csv()
+
+If you are processing a lot of data, you may consider writing to a real file instead of using an in-memory file.
",python
+"Change how PyAD searches for UsersI am working on creating a python script that can connect to AD and search for user attributes such as (name, email, location, email, extension). Currently I am searching users by CN to find their AD account. The problem I am running into is that some users have a middle initial in their CN but not on their display name. Is it possible to search a user by their display name or sAMAccount name to then be able to pull the attributes from their AD account?
+The script is below and works fine when searching by CN.
+from tkinter import N
+from pyad import*
+from pyad import adquery
+from pyad import aduser
+from nameparser import HumanName
+from nameparser.config import CONSTANTS
+
+from StatesFun import StatesL
+
+#connecting to AD
+pyad.set_defaults (ldap_server="", Adminusername="", password="")
+UserName = input("Please input the username of the user requesting a DAT account, (first lastname, not case sensitive)\n")
+
+#Searching user in AD
+user = pyad.aduser.ADUser.from_cn(UserName)
+
+#searching for user attributes
+#pop takes element out of list and converts to string
+nameAD = user.get_attribute("cn")
+name = nameAD.pop(0)
+emailAD = user.get_attribute("mail")
+email = emailAD.pop(0)
+stAD = user.get_attribute("st")
+st = stAD.pop(0)
+extAD = user.get_attribute("telephoneNumber")
+ext = extAD.pop(0)
+
+#Parses name for initials
+def initials(full_name):
+    initial=""
+    if (len(full_name) == 0):
+        return
+
+    first_middle_last = full_name.split(" ")
+    for name in first_middle_last:
+        initial=initial+name[0].upper()+""
+    return initial
+
+#Splits First / Last Name into own text values
+Hname = HumanName(name)
+Hname = Hname
+
+#Parses TQL Username from Email
+DatUsrNameAD = (email.split('@'))
+DatUsrName = DatUsrNameAD.pop(0)
+
+print(DatUsrName)
+print(Hname.first)
+print(Hname.last)
+print(initials(name))
+print(StatesL(st))
+print(ext)
+print(email)
+
","You could use something like this to search by SamAccountName. You would just need to update the base_dn to match your company's domain settings.
+import pyad.adquery
+
+q = pyad.adquery.ADQuery()
+
+user = 'abc123'
+
+q.execute_query(
+    attributes = ["departmentNumber"],
+    where_clause = f"SamAccountName = '{user}'",
+    base_dn="DC=*,DC=*,DC=*"
+)
+
+for row in q.get_results():
+    dept = row["departmentNumber"]
+    print (dept)
+
",python
+"Turning column of list of lists (of unequal length) into separate variable columns (python, pandas)I'm having trouble turning a column of lists of lists into separate columns. I have a bad solution that works by working on each row independently and then appending them to each other, but this takes far too long for ~500k rows. Wondering if someone has a better solution.
+Here is the input:
+>>> import pandas as pd
+>>> import numpy as np
+>>> pd.DataFrame({'feat': [[["str1","", 3], ["str3","", 5], ["str4","", 3]],[["str1","", 4], ["str2","", 5]] ]})
+
+|   | feat                                    |
+|---|-----------------------------------------|
+| 0 | [[str1, , 3], [str3, , 5], [str4, , 3]] |
+| 1 | [[str1, , 4], [str2, , 5]]              |
+
+Desired output:
+>>> pd.DataFrame({'str1': [3, 4], 'str2': [np.nan,5] , 'str3': [5,np.nan], 'str4': [3,np.nan]})
+
+|   | str1 | str2 | str3 | str4 |
+|---|------|------|------|------|
+| 0 | 3    | NaN  | 5    | 3    |
+| 1 | 4    | 5    | NaN  | NaN  |
+
+Update: Solved by @ifly6! Fastest solution by far. For 100k rows and 80 total variables, the total time taken was 8.9 seconds for my machine.
","Loading your df, create df1 as follows:
+df1 = pd.DataFrame.from_records(df.explode('feat').values.flatten()).replace('', np.nan)
+df1.index = df.explode('feat').index
+
+Set index on df1 from the original data to preserve row markers (passing index=df.explode('feat').index does not work). (Alternatively, to get to the point where you have separated the lists into columns, you could use df.explode('feat')['feat'].apply(pd.Series). I prefer, however, to avoid apply so use the DataFrame constructor instead.)
+Reset index on df1 then set multi-index (cannot set the column 0 index directly because it overwrites the original index):
+df1.reset_index().set_index(['index', 0])
+# df1.set_index(0, append=True) # alternatively should work
+
+Then unstack. You can drop columns that are all NaN by appending .dropna(how='all', axis=1), yielding:
+>>> df1.reset_index().set_index(['index', 0]).unstack().dropna(how='all', axis=1)
+ 2
+0 str1 str2 str3 str4
+index
+0 3.0 NaN 5.0 3.0
+1 4.0 5.0 NaN NaN
+
+This solution also largely avoids hard-coding which specific columns to look at or move about.
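Putting the steps above together, a minimal end-to-end sketch on the sample data from the question might look like the following. It is a variant under stated assumptions rather than the exact code above: it builds df1 with the plain DataFrame constructor so that the exploded index can be passed directly, and it selects column 2 before unstacking so that the result has flat column names. The names exploded and wide are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'feat': [
    [['str1', '', 3], ['str3', '', 5], ['str4', '', 3]],
    [['str1', '', 4], ['str2', '', 5]],
]})

# One row per inner list, keeping the original row label as the index.
exploded = df.explode('feat')

# Split each inner list into columns 0 (name), 1 (unused) and 2 (value).
df1 = pd.DataFrame(exploded['feat'].tolist(), index=exploded.index)

# Move the name into the index, keep only the value column, then pivot.
wide = df1.set_index(0, append=True)[2].unstack()
print(wide)
```

Selecting a single column before unstack avoids the extra column level seen in the output above, at the cost of hard-coding which position holds the value.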
",python
+"Optional positional argument, that only accepts values from a specified listI'm rewriting a legacy C program in Python 3, using argparse. The program takes zero, one or more positional arguments, that have to be from a specified list. Let's say the possible values are 'A', 'B', 'C', 'D' and 'E', for simplicity's sake. There are no other arguments in the legacy program, and I don't expect them in the new version. But you never know. :-)
+If I add the argument without choices, like this:
+p.add_argument("action", help = "What to do", nargs='*')
+
+It works perfectly, I can supply zero, one or more of that argument.
+But if I specify the choice list, like this:
+p.add_argument("action", help = "What to do", choices=["A", "B", "C", "D", "E"], nargs='*')
+
+I can no longer specify zero arguments. I get this error:
+
+error: argument action: invalid choice: [] (choose from 'A', 'B', 'C',
+'D', 'E')
+
+Is there any way to be able to add an argument that will accept zero, one or more arguments from a specified list?
","You can call add_argument once for each option in your list, supplying the whole list as the possible choices for each argument and using '?' for nargs.
+For example:
+import sys
+import argparse
+
+parser = argparse.ArgumentParser(sys.argv[0])
+choices = ["A", "B", "C", "D", "E"]
+for i, choice in enumerate(choices):
+    parser.add_argument(choice, metavar=choice,
+                        help="Help Message For Choice" + choice,
+                        choices=choices, nargs='?')
+
+args = parser.parse_args()
+
+This will accept 0, 1, 2, ... len(choices) arguments. You will probably want to override the usage string if you go this route.
+python3 name_of_pythonfile.py -h
+usage: name_of_pythonfile.py [-h] [A] [B] [C] [D] [E]
+
+positional arguments:
+ A Help Message for ChoiceA
+ B Help Message for ChoiceB
+ C Help Message for ChoiceC
+ D Help Message for ChoiceD
+ E Help Message for ChoiceE
+
+options:
+ -h, --help show this help message and exit
+
+
+To override the default usage message you just need to pass a string to the usage keyword argument in the ArgumentParser constructor.
+parser = ArgumentParser(prog, usage="My New Usage String")
+
+argparse docs
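To see how this behaves without touching the real command line, here is a small sketch that feeds explicit argument lists to parse_args. The helper names build_parser and parse_actions and the prog value are assumptions made for the illustration; only the add_argument pattern comes from the answer above.

```python
import argparse

choices = ['A', 'B', 'C', 'D', 'E']

def build_parser():
    parser = argparse.ArgumentParser(prog='demo')
    # One optional positional slot per choice; every slot accepts any choice.
    for choice in choices:
        parser.add_argument(choice, metavar=choice, choices=choices, nargs='?',
                            help='Help Message For Choice' + choice)
    return parser

def parse_actions(argv):
    args = build_parser().parse_args(argv)
    # Slots are filled left to right; unused slots stay None.
    values = [getattr(args, choice) for choice in choices]
    return [value for value in values if value is not None]

print(parse_actions([]))
print(parse_actions(['B', 'A']))
```

Note that the slot names do not constrain the values: the first slot is called A but happily stores B. An invalid value, or more values than there are slots, makes argparse exit with an error.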
",python
+"Printing 2 dictionaries at the same line produces different output compared to separateSo when I print the dictionaries alone (price[1] and price[2]) they print the desired output (different outputs), but when I print both at the same time in the same line they produce the same output (both are literally the same)
+price[1] = driver.find_elements(By.XPATH, """//div[contains(@aria-label, 'dollars')]""") # Get Prices from Calendar
+time.sleep(1.5)
+for i in range(3):
+    print(str(price[1][i].get_attribute("innerHTML")))
+#-----------------------------------------
+vero = driver.find_element(By.XPATH, """//span[contains(text(),'Next')]/following-sibling::button""") # Click The Next Button
+vero.click()
+time.sleep(3)
+print("\n")
+#-----------------------------------------
+price[2] = driver.find_elements(By.XPATH, """//div[contains(@aria-label, 'dollars')]""") # Get Prices from Calendar
+time.sleep(1.5)
+for i in range(3):
+    print(str(price[2][i].get_attribute("innerHTML")))
+
+So separated from each other they produce an output like this:
+$101
+$200
+$305
+
+$456
+$789
+$890
+
+But when I try to print them in the same line at the end of the code:
+for i in range(3):
+    print(str(price[1][i].get_attribute("innerHTML")) + " <><><> " + str(price[2][i].get_attribute("innerHTML")))
+
+It produces this repetition! :
+$101 <><><> $101
+$200 <><><> $200
+$305 <><><> $305
+
+How do I produce this desired outcome? :
+$101 <><><> $456
+$200 <><><> $789
+$305 <><><> $890
+
","If your print statements are correct, this should work; otherwise, it seems your web driver is fetching the same data for both prices.
+Price_values_1 = []
+
+for i in range(3):
+    Price_values_1.append(str(price[1][i].get_attribute("innerHTML")))
+#-----------------------------------------
+Price_values_2 = []
+
+for i in range(3):
+    Price_values_2.append(str(price[2][i].get_attribute("innerHTML")))
+
",python
+"Trying to make a leveling system, however it only works once and then stops working?I'm making a leveling system and it only levels me up once and then stops working. Once it levels me the xp doesn't reset and my level does not go up. Here's the code!
+level = int(1)
+crexp = int(260)
+reqxp = int(100)
+while crexp >= reqxp:
+    level = level+1
+    crexp = crexp-reqxp
+    reqxp = (reqxp/100)*120
+    continue
+while 3 > 2:
+    pinput = input()
+    if pinput == "1":
+        crexp = crexp + 60
+    elif pinput == "2":
+        print(level)
+    elif pinput == "3":
+        print(crexp)
+    elif pinput == "4":
+        print(reqxp)
+    elif pinput == "5":
+        break
+
","The problem with your current code is that you are not rerunning the 'level up' part of the code. Python generally (when not in a while/for loop, etc.) reads your code from top to bottom. This means that by the time you get into the second while loop, the first while loop has finished and will never be run again.
+To fix this, you want to tell Python to recalculate the level and experience variables at certain points. The easiest way to do this is to make the first while loop into a function and call it at the start of the second while loop. You would get something like this:
+def checkLevelUp(currentXp, requiredXp, currentLevel):
+    while currentXp >= requiredXp:
+        currentLevel = currentLevel+1
+        currentXp = currentXp-requiredXp
+        requiredXp = int(requiredXp * 1.2)
+    return currentLevel, currentXp, requiredXp
+
+
+level = 1
+crexp = 260
+reqxp = 100
+
+
+while True:
+    level, crexp, reqxp = checkLevelUp(crexp, reqxp, level)
+    pinput = input()
+    if pinput == "1":
+        crexp = crexp + 60
+    elif pinput == "2":
+        print(level)
+    elif pinput == "3":
+        print(crexp)
+    elif pinput == "4":
+        print(reqxp)
+    elif pinput == "5":
+        break
+
+Note also the changes to calculating the next required xp - dividing by 100 and then multiplying by 120 is just the same as multiplying by 1.2.
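To check the level-up logic without typing into the input loop, a non-interactive sketch such as the one below can help. It is not the code above verbatim: the snake_case name check_level_up is illustrative, and the 1.2 multiplier is written in integer arithmetic (an assumption of this sketch) so that the required XP stays a whole number without any floating-point rounding.

```python
def check_level_up(current_xp, required_xp, current_level):
    # Keep levelling up while there is enough XP for the next level.
    while current_xp >= required_xp:
        current_level += 1
        current_xp -= required_xp
        # Integer form of 'multiply by 1.2', rounded down.
        required_xp = required_xp * 120 // 100
    return current_level, current_xp, required_xp

# Starting values from the question: level 1, 260 XP, 100 XP required.
level, crexp, reqxp = check_level_up(260, 100, 1)
print(level, crexp, reqxp)

# Simulate choosing option '1' twice (each adds 60 XP).
for _ in range(2):
    crexp += 60
    level, crexp, reqxp = check_level_up(crexp, reqxp, level)
print(level, crexp, reqxp)
```

Because the function is called after every XP gain, further level-ups are picked up no matter how many times XP is added.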
",python
+"Convert a string using regex_replace filter python and ansibleI would like to convert all the lines in @cat\n@chicken\napple\nfruit\njuice into + : @cat : all\n+ :@chicken: all\n+ :apple : all\n+ : fruit : all\njuice : all. In other words, I want to get + : value : all for every line.
+I would like to use the regex_replace filter to perform the task. I don't have much knowledge of Python; I am trying this:
+{{ '@cat\n@chicken\napple\nfruit\njuice'| regex_replace('^(?P<name>)$', '\\g<name>: ALL' , multiline=True, ignorecase=True)}} but nothing happens. I am missing something here.
","You can use
+regex_replace(r'.+', r'+ : \g<0> : all' )
+
+to wrap each non-empty line in + : <line_here> : all text.
+Note that . matches CR chars, too, so if the line endings are CRLF, you will have to replace . with [^\r\n].
+Here, \g<0> is a replacement backreference to the whole match value, so there is no need to use named capturing groups.
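The same pattern and replacement can be tried in plain Python. This sketch assumes the filter hands its arguments to Python's re module, so re.sub stands in for regex_replace here; the variable names are illustrative.

```python
import re

text = '@cat\n@chicken\napple\nfruit\njuice'

# '.+' matches one non-empty line at a time, because '.' stops at '\n'.
wrapped = re.sub(r'.+', r'+ : \g<0> : all', text)
print(wrapped)

# Variant for CRLF input: keep '\r' out of the match as well.
crlf_text = text.replace('\n', '\r\n')
wrapped_crlf = re.sub(r'[^\r\n]+', r'+ : \g<0> : all', crlf_text)
print(repr(wrapped_crlf))
```

The second call shows why the character class matters: with '.+' on CRLF input, the carriage return would end up inside the wrapped text, before the trailing ' : all'.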
",python
+"is it possible to add code to a list in pythonimport PySimpleGUI as sg
+
+rows_needed = 2
+result = [['text1', 'text2'], ['text3']]
+menu_layout = []
+
+for x in range(0,rows_needed):
+    temp = []
+    try:
+        temp.append(sg.Button(c) for c in result[x])
+    finally:
+        pass
+    menu_layout.append(temp)
+    print(menu_layout)
+layout = [[sg.Button(c) for c in result]]
+window = sg.Window('', menu_layout)
+
+window.read()
+
+
+So I'm attempting to create a nested list for the menu layout.
+The result I want would be, for example:
+menu_layout = [[sg.Button('text1'), sg.Button('text2')], [sg.Button('text3')],]
+
+I'm using PySimpleGUI.
+My current code at the top gives the following result in PowerShell:
+
+[[<generator object menu.<locals>.<genexpr> at 0x0000016308733AE0>]]
+[[<generator object menu.<locals>.<genexpr> at 0x0000016308733AE0>],
+[<generator object menu.<locals>.<genexpr> at 0x0000016308733C30>]]
+Traceback (most recent call last): File
+"C:\Users\cafemax\projects\POS\POS\Client_Posv2.py", line 713, in
+<module>
+ menu() File "C:\Users\cafemax\projects\POS\POS\Client_Posv2.py", line 532, in menu
+ window = sg.Window('', menu_layout) File "C:\Users\cafemax\.venvs\stockcontrol\lib\site-packages\PySimpleGUI\PySimpleGUI.py",
+line 9604, in __init__
+ self.Layout(layout) File "C:\Users\cafemax\.venvs\stockcontrol\lib\site-packages\PySimpleGUI\PySimpleGUI.py",
+line 9783, in layout
+ self.add_rows(new_rows) File "C:\Users\cafemax\.venvs\stockcontrol\lib\site-packages\PySimpleGUI\PySimpleGUI.py",
+line 9753, in add_rows
+ self.add_row(*row) File "C:\Users\cafemax\.venvs\stockcontrol\lib\site-packages\PySimpleGUI\PySimpleGUI.py",
+line 9708, in add_row
+ if element.ParentContainer is not None: AttributeError: 'generator' object has no attribute 'ParentContainer'
+
+
+The reason I'm trying to do this is that I need to generate a variable number of buttons based on the size of a list, so I can't hard-code this.
+Any help on how to fix this, or on what change I must make to my append, would be appreciated.
","temp doesn't contain sg.Button objects, it contains generators because that's what you appended to it. You don't want to create temp and then append the generator to it, you want to extend your list with your generator. See What is the difference between Python's list methods append and extend?
+for x in range(0, rows_needed):
+    temp = []
+    try:
+        temp.extend(sg.Button(c) for c in result[x])
+    finally:
+        pass
+    menu_layout.append(temp)
+
+Alternatively, you can simply create temp using a list comprehension:
+for x in range(0, rows_needed):
+    temp = []
+    try:
+        temp = [sg.Button(c) for c in result[x]]
+    finally:
+        pass
+    menu_layout.append(temp)
+
+I'm not sure what the try..finally is for, it doesn't seem to be doing anything, but I left it in because that's not the question you're asking.
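The difference between the two list methods can be seen without any GUI code. The sketch below uses plain strings in place of sg.Button objects; the variable names are illustrative only.

```python
import types

words = ['text1', 'text2']

appended = []
appended.append(w.upper() for w in words)   # adds ONE item: the generator object itself

extended = []
extended.extend(w.upper() for w in words)   # consumes the generator and adds each item

print(len(appended), isinstance(appended[0], types.GeneratorType))
print(extended)
```

The appended list ends up holding a single generator, which is what the layout code later trips over, while the extended list holds the two values themselves.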
",python
+"Cleaning string column that contains float numberI am trying to remove the point and zero from every float value within this dataset
+ index CIP
+ 1 DF5TY34
+ 2 12342.0
+ 3 de44dW
+
+(CIP is casted as String)
+I wrote this line to resolve the problem, but it's not doing anything, and I'm receiving only a warning, no errors:
+ pro1[pro1['CIP'].str.contains('\..')]["CIP"] = pro1.loc[pro1['CIP'].str.contains('\..')]["CIP"].astype(float).astype(int).astype(str)
+
+this is the warning:
+/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
+A value is trying to be set on a copy of a slice from a DataFrame.
+Try using .loc[row_indexer,col_indexer] = value instead
+
+See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
+"""Entry point for launching an IPython kernel.
+
","For a strict replacement of .0, you can use removesuffix:
+df['CIP'] = df['CIP'].str.removesuffix('.0')
+
+For a more flexible approach, use a regex with str.replace:
+df['CIP'] = df['CIP'].str.replace(r'\.0*$', '', regex=True)
+
+output:
+ index CIP
+0 1 DF5TY34
+1 2 12342
+2 3 de44dW
+
+regex:
+\. # match a dot
+0* # match any number of 0 (including none)
+$ # match end of line
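To see what this pattern does and does not strip, it can be tried on a few plain strings with Python's re module. This is only a sketch of the regex itself, on the assumption that the pandas string method applies the same pattern to each value; the extra sample values 7.000 and 12.50 are made up for the illustration.

```python
import re

values = ['DF5TY34', '12342.0', 'de44dW', '7.000', '12.50']

# Remove a trailing dot followed only by zeros; everything else is left alone.
cleaned = [re.sub(r'\.0*$', '', value) for value in values]
print(cleaned)
```

A value such as 12.50 is left untouched because the digits after the dot are not all zeros.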
+
",python
+"How to find string with spaces between two characters using Regex?Currently I have a string that I want to parse and pick up certain values.
+The current regex findall pattern that I have is:
+re.findall(r'(?P<key>\w+)\s+(?P<value>\w+)')
+
+With this regex findall pattern I can pick up the key and values of the following:
+--key1=value1 --key2=value2
+
+But if the value is a string with spaces, it doesn't pick it up. Examples that don't work:
+--key1=this is value 1 --key2=value2
+--key1=only kvp
+--key1=this/doesnt/work/
+
+How can I adjust the regex pattern to pick up the string after the = sign?
","I started by changing your regex to --(?P<key>\w+)=(?P<value>\w+). This way, it uses "=" instead of a whitespace as a separator between key and value. It also requires "--" to precede the key, which seems to be a rule in your data.
+Now let's tackle the main problem, which is to capture as a value everything after the "=" sign, up to (but not including) the next key.
+This can be done in three steps:
+
+1. Change the regex for the value from \w+ to .+. You want to capture all characters, so you cannot limit yourself to just \w; . will capture everything. Of course, this change causes a new problem: the value will now contain everything that follows the key, even if it is "value1 --key2=value2". This will be fixed in the remaining two steps.
+
+2. The next step is to make the regex non-greedy. Change the regex for the value from .+ to .+? and it will capture the fewest characters it can instead of the most. This still doesn't solve the problem, because the regex will capture only one character of the value. We are a step closer, though.
+
+3. The last step is to keep the regex capturing the value until it encounters the next key or the end of the string. Add (?=$|\s--) at the end. (?=) is a positive lookahead. It means that the next part must follow the current position, but it is not part of the match itself. $|\s-- is an alternation of either the end of the string or a whitespace character and two dashes.
+
+
+The finished regex is:
+re.findall(r'--(?P<key>\w+)=(?P<value>.+?)(?=$|\s--)', string)
+
+It should handle everything other than a value that contains whitespace followed by --.
+For example:
+import re
+string = "--key1=value 1 has--really .:weird:. characters --key2=value2"
+result = re.findall(r'--(?P<key>\w+)=(?P<value>.+?)(?=$|\s--)', string)
+print(result)
+
+gives:
+[('key1', 'value 1 has--really .:weird:. characters'), ('key2', 'value2')]
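As a further check, the finished pattern can be run against the three inputs from the question that did not work before. The helper name parse_options is an assumption made for this sketch; it simply turns the list of tuples into a dict.

```python
import re

PATTERN = re.compile(r'--(?P<key>\w+)=(?P<value>.+?)(?=$|\s--)')

def parse_options(text):
    # findall returns (key, value) tuples, which dict() accepts directly.
    return dict(PATTERN.findall(text))

samples = [
    '--key1=this is value 1 --key2=value2',
    '--key1=only kvp',
    '--key1=this/doesnt/work/',
]
for sample in samples:
    print(parse_options(sample))
```

If the same key can appear more than once, keep the list of tuples instead, since a dict retains only the last value for each key.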
+
",python
+"Airflow: unable to configure handler ""processor""I'm very very new to Apache-airflow, just started a course in udemy (this course) .
+We have received a YAML file and were asked to follow instructions in order to install Airflow.
+I have been able to install Airflow, I believe, because I haven't gotten any error while doing the next steps:
+
+- create a new file new_file.env (Visual Studio)
+- add these lines inside new file and save it :
+
+AIRFLOW_IMAGE_NAME=apache/airflow:2.3.0
+AIRFLOW_UID=50000
+
+
+- open the terminal and run
+
+docker-compose up -d
+
+Then when I tried to open localhost:8080 nothing was opened.
+When I checked the containers, using
+docker-compose ps
+
+I have seen that some of the containers are not healthy.
+docker logs materials_name_of_the_container #here I inserted a different container name every time
+
+The most common error was "unable to configure handler "processor"
+I have seen a post here with a similar problem that recommended using pip to install Azure, and I have tried to run this in the terminal:
+pip install 'apache-airflow[azure_blob_storage,azure_data_lake,azure_cosmos,azure_container_instances]
+
+#also tried with one more '
+
+pip install 'apache-airflow[azure_blob_storage,azure_data_lake,azure_cosmos,azure_container_instances]'
+
+I got an error that pip is not recognized.
+My goal is to complete the installation and, along the way, to understand why these errors happened. I believe the YAML file is fine (because it's an organized course), but I have no idea where or what happened inside the containers, or where these containers are located, so any "dummy" explanation for a beginner is welcome.
","In the end I could solve this issue after reading this troubleshooting page. I was trying to configure the users & groups, and then I realized that, for some reason, my user was not the administrator. I changed it, uninstalled everything (Docker + Visual Studio), and then installed it again, and, yayyy, it worked and this error didn't appear.
+So the steps I did:
+(The first part, changing to administrator, I did yesterday and I don't remember 100% how. I think it was this; if it's wrong, please let me know.)
+
+1. Enter the users & groups settings. That was not available in my Windows 10; if I remember correctly, I pressed Windows + R on the keyboard, then typed "netplwiz" and pressed Enter.
+2. Double-click on the username to open its properties tab.
+3. Select administrator (the screenshot was taken after the fix, so it may not show the Docker options).
+4. Restart the computer.
+5. Uninstall Docker and Visual Studio.
+6. Install Docker and Visual Studio again.
+7. Run docker-compose up -d.
+
+After a few minutes it worked.
+Thanks to everyone who tried to help me :)
+Let me know if there is something to improve in this answer.
",python
+"Is there a vectorized way to find maxes within labeled areas in NumPy?I have a 2D array representing tree heights, where 0 is the ground. I have another array that's always the same size showing segmented and labeled trees, where a 0 label means ground, and a positive integer value represents a unique tree. Here are some slices of the data:
+heights = array([[37.5 , 41.82, 42.18, 42.18, 42.18, 39.23, 40.68, 40.71, 40.71,
+ 40.19, 35.03, 41.41, 41.41, 41.41, 40.77, 32.23, 32.23, 32.23,
+ 31.45, 25.6 , 25.63, 30.12, 30.78, 30.78, 30.92],
+ [37.5 , 37.5 , 41.82, 42.18, 41.78, 41.78, 40.68, 40.68, 40.68,
+ 40.19, 41.04, 41.41, 41.41, 41.41, 41.03, 32.23, 32.23, 32.23,
+ 31.25, 25.6 , 25.6 , 30.12, 30.12, 21.08, 30.88],
+ [37.5 , 37.5 , 34.61, 41.78, 41.78, 25.6 , 39.14, 40.68, 38.79,
+ 38.79, 41.04, 41.04, 41.8 , 41.8 , 41.8 , 24.66, 24.66, 31.25,
+ 25.63, 26.24, 26.2 , 25.2 , 24.93, 21.03, 21.03],
+ [34.53, 34.61, 34.61, 35.23, 35.23, 25.32, 25.32, 33.17, 33.17,
+ 38.86, 39.4 , 40.31, 41.8 , 41.8 , 41.8 , 41.17, 25.37, 26.77,
+ 27.32, 27.39, 27.39, 26.96, 25.2 , 28.68, 28.68],
+ [34.53, 34.52, 36.5 , 36.58, 36.67, 36.67, 25.15, 33.17, 38.65,
+ 38.86, 39.4 , 39.53, 40.78, 41.17, 41.17, 0. , 26.77, 27.09,
+ 27.39, 27.6 , 27.6 , 28. , 28.16, 28.68, 28.68],
+ [32.22, 36.45, 37.1 , 37.28, 37.28, 38.07, 30.98, 31.12, 38.65,
+ 38.65, 39.12, 39.4 , 40.78, 40.78, 0. , 0. , 27.41, 27.72,
+ 27.72, 28.49, 28.49, 28.16, 28.34, 28.87, 28.68],
+ [36.45, 37.1 , 37.1 , 37.28, 38.23, 38.23, 38.23, 33.61, 32.31,
+ 38.65, 38.65, 38.62, 39.01, 33.75, 34.65, 34.65, 27.41, 27.72,
+ 27.72, 28.49, 28.49, 28.49, 28.87, 30.31, 30.31],
+ [35.71, 36.45, 37.1 , 30.96, 38.23, 38.23, 38.23, 33.61, 33.28,
+ 33.42, 33.5 , 33.5 , 33.51, 34.07, 34.65, 34.65, 27.36, 27.83,
+ 27.83, 28.49, 28.49, 28.43, 28.87, 31.82, 31.68],
+ [14.44, 0. , 0. , 0. , 21.41, 32.98, 33.61, 33.61, 34.27,
+ 34.8 , 34.8 , 33.5 , 33.4 , 34.07, 34.65, 34.65, 0. , 27.83,
+ 27.83, 28.7 , 29.18, 29.18, 31.82, 31.82, 31.98],
+ [13.46, 0. , 0. , 21.41, 21.73, 31.36, 33.33, 33.33, 34.89,
+ 34.99, 34.99, 32.72, 33.4 , 33.8 , 33.8 , 0. , 0. , 0. ,
+ 28.7 , 28.7 , 29.64, 29.64, 31.82, 31.82, 35.82],
+ [13.46, 0. , 0. , 0. , 21.73, 31.36, 31.46, 35.81, 36.33,
+ 36.33, 36.33, 32.72, 33.37, 33.71, 33.71, 0. , 0. , 0. ,
+ 28.7 , 29.64, 29.64, 29.77, 29.77, 29.77, 35.95],
+ [ 0. , 0. , 0. , 0. , 0. , 24.07, 31.57, 35.9 , 36.33,
+ 36.33, 36.33, 21.97, 32.72, 33.37, 33.37, 0. , 0. , 0. ,
+ 28.36, 29.04, 29.64, 29.77, 29.77, 29.77, 35.95],
+ [ 0. , 0. , 0. , 0. , 22.09, 24.07, 23.92, 31.57, 35.9 ,
+ 36.33, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 28.38, 29.53, 28.96, 28.96, 28.69, 29.19, 35.49],
+ [ 0. , 0. , 0. , 0. , 22.09, 22.09, 22.09, 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 29.53, 29.53, 29.82, 28.96, 28.73, 29.19, 29.19],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 29.53, 30.12, 30.12, 29.82, 28.73, 0. , 28.89],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 30.12, 30.12, 30.12, 28.94, 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 30.12, 30.12, 29.82, 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 28.65, 28.65, 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ],
+ [ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
+ 0. , 0. , 0. , 0. , 0. , 0. , 0. ]], dtype=float32)
+
+labeled_trees = array([[33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37,
+ 37, 37, 37, 37, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
+ 37, 37, 37, 37, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
+ 37, 37, 37, 39, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
+ 37, 37, 39, 39, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37, 0,
+ 39, 39, 39, 39, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 0, 0,
+ 39, 39, 39, 39, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
+ 37, 39, 39, 39, 39, 39, 39, 39, 39],
+ [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
+ 37, 39, 39, 39, 39, 39, 39, 39, 39],
+ [33, 0, 0, 0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37,
+ 0, 39, 39, 39, 39, 39, 39, 39, 39],
+ [33, 0, 0, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 0,
+ 0, 0, 39, 39, 39, 39, 39, 39, 39],
+ [33, 0, 0, 0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 0,
+ 0, 0, 39, 39, 39, 39, 39, 39, 39],
+ [ 0, 0, 0, 0, 0, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 0,
+ 0, 0, 39, 39, 39, 39, 39, 39, 39],
+ [ 0, 0, 0, 0, 33, 33, 33, 33, 33, 33, 0, 0, 0, 0, 0, 0,
+ 0, 0, 39, 39, 39, 39, 39, 39, 39],
+ [ 0, 0, 0, 0, 33, 33, 33, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 39, 39, 39, 39, 39, 39, 39],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 39, 39, 39, 39, 39, 0, 39],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 39, 39, 39, 39, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 39, 39, 39, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 39, 39, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=int32)
+
+I'd like to find the max height within each labeled region. I have done this successfully with a for loop, but it's slow.
+max_heights = {}
+for label in list(np.unique(labeled_trees))[1:]:
+    tree_height = np.amax(heights[labeled_trees == label])
+    max_heights[str(label)] = tree_height
+
+# max_heights = {'33': 42.18, '37': 41.8, '39': 35.95}
+
+Is there a faster/vectorized/more efficient way of finding the max values within labeled regions of a numpy array? The ideal output would be a boolean array where the location of each max is True.
+[EDIT]
+The maximum_position function from scipy.ndimage is promising, but it looks like it only returns the first location where the pixel equals the local max. I need every location within a labeled region that equals its max.
","Here is a much simpler use of np.maximum.reduceat:
+idx = labeled_trees.argsort(None)
+sorted_labeled_trees = labeled_trees.ravel()[idx]
+sorted_heights = heights.ravel()[idx]
+bins = np.flatnonzero(np.diff(sorted_labeled_trees) != 0) + 1
+max_heights = np.maximum.reduceat(sorted_heights, bins)
+max_trees = sorted_labeled_trees[bins]
+
+If you insist on a dictionary, you can make one with zip:
+result = dict(zip(max_trees, max_heights))
+
+If you want a mask of the positions where the maxima occur and the number of trees is relatively small, you can compute the mask more-or-less directly using broadcasting:
+peak_mask = ((max_trees == labeled_trees[..., None]) & (max_heights == heights[..., None])).any(-1)
+
+If the number of trees is not small, you will be better off using a loop over the labels:
+peak_mask = np.zeros(labeled_trees.shape, bool)
+for t, h in zip(max_trees, max_heights):
+    peak_mask |= (labeled_trees == t) & (heights == h)
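For a quick check on a small input, here is a self-contained sketch of the reduceat approach on a made-up 3x3 example. The arrays, the stable sort kind and the lookup-table way of building the mask are assumptions of this sketch, not part of the code above; the lookup table is simply a third way to get the mask with neither broadcasting nor a loop. It assumes that label 0 (ground) is present and that labels are small non-negative integers.

```python
import numpy as np

heights = np.array([[1.0, 5.0, 2.0],
                    [5.0, 0.0, 7.0],
                    [0.0, 3.0, 7.0]], dtype=np.float32)
labels = np.array([[1, 1, 2],
                   [1, 0, 2],
                   [0, 2, 2]], dtype=np.int32)

# Sort the flattened pixels by label so each tree forms one contiguous run.
order = labels.argsort(axis=None, kind='stable')
sorted_labels = labels.ravel()[order]
sorted_heights = heights.ravel()[order]

# Start index of every run except the first one (label 0, the ground).
starts = np.flatnonzero(np.diff(sorted_labels) != 0) + 1
max_heights = np.maximum.reduceat(sorted_heights, starts)
max_labels = sorted_labels[starts]

# Lookup table: per-label maximum, with inf for labels that have no maximum.
lookup = np.full(labels.max() + 1, np.inf, dtype=heights.dtype)
lookup[max_labels] = max_heights
peak_mask = (labels > 0) & (heights == lookup[labels])

print(dict(zip(max_labels.tolist(), max_heights.tolist())))
print(peak_mask)
```

The lookup table needs memory proportional to the largest label, so it suits compact label ranges best; with sparse, very large labels the loop above may be the safer choice.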
+
",python
+"Why the sequence of data in a heap is not like what I thoughtI'm learning the Heap data structure and found this link
+I'm expecting the output of heapq.heapify([5, 7, 9, 1, 3]) to be
+[1, 3, 9, 5, 7]
+
+However, I saw it returns this:
+[1, 3, 9, 7, 5]
+
+Some suggested that the order on the same level does not matter, but then I am confused about how [5, 7] in the original sequence swapped positions.
+Could someone explain why it is like this?
","Note that the Heap data structure only guarantees the top to be the minimum [or maximum]. It isn't anything like a Binary Search Tree, which I gather is what you're expecting with the order.
+The invariant for a Binary Search Tree is that for every node x, all the keys in the left subtree should be smaller than x and all the keys in the right subtree should be greater than x.
+However, in a Heap, the invariant is just that every node x should be greater [or smaller] than its children. Note how it doesn't specify anything about its left or right subtree.
+The sequence [1, 3, 9, 7, 5] is a valid heap. Note how every parent is smaller than its children.
+ 1
+ 3 9
+ 7 5
+
",python
+"Flask Rest API SQLAlchemy foreign key errorI got SQLALchemy error, when I tried "flask db migrate"
+NoReferencedTableError: Foreign key associated with column 'user.menu_id' could not find table 'menu' with which to generate a foreign key to target column 'id
+
+Menu table
+class Menu(db.Model):
+    __tablename__ = 'menus'
+    id = db.Column(db.Integer(), primary_key=True)
+    name = db.Column(db.String(64), index=True, unique=True)
+    price = db.Column(db.String(64), index=True, unique=True)
+    description = db.Column(db.String(64), index=True, unique=True)
+    picture = db.Column(db.String(64), index=True, unique=True)
+    create_date = db.Column(db.DateTime, default=datetime.utcnow)
+    users = db.relationship('User', backref="menu", lazy=True)
+
+User table
+class User(Model):
+    """ User model for storing user related data """
+
+    id = Column(db.Integer, primary_key=True)
+    email = Column(db.String(64), unique=True, index=True)
+    username = Column(db.String(15), unique=True, index=True)
+    name = Column(db.String(64))
+    password_hash = Column(db.String(128))
+    admin = Column(db.Boolean, default=False)
+
+    joined_date = Column(db.DateTime, default=datetime.utcnow)
+    userdataset = db.relationship("Dataset", backref="user", lazy="dynamic")
+    menu_id = Column(db.Integer(), db.ForeignKey('menu.id'), nullable=False)
+    def __init__(self, **kwargs):
+        super(User, self).__init__(**kwargs)
+
+How can I solve this problem? What am I doing wrong?
","You have renamed your 'Menu' table to 'menus' with this __tablename__ property in your 'Menu' model:
+__tablename__ = 'menus'
+
+Your foreign key then refers to a table called 'menu', when in fact the table is now named 'menus'. The simplest way to solve this would be to change your User.menu_id column to this:
+menu_id = Column(db.Integer(), db.ForeignKey('menus.id'), nullable=False)
+
+Another way of fixing this issue would be modifying the __tablename__ property to 'menu'. (You could also just delete it.)
",python
+"Altair Grouped Bar Chart With Multiple ConditionsI have this DataFrame called table:
+ TERM Bitcoin S&P500 Real Estate Gold
+0 High-Inflation/ Short term 3097.94 -3700.78 761.23 6512.71
+1 High-Inflation/ Mid term — -3080.01 -8434.66 3242.40
+2 High-Inflation/ Long term — -2089.25 -9117.96 8174.43
+3 Low-Inflation/ Short term 780200.00 -273.71 1824.72 2214.51
+4 Low-Inflation/ Mid term 21013600.00 5331.40 35810.58 -2879.37
+5 Low-Inflation/ Long term 978017143.00. 15045.41 35895.81 861.90
+
+And I want to make a grouped (or stacked) bar chart that distinguishes return on investments for each of these assets based on the TERM column. I have tried this:
+alt.Chart(table).transform_fold(
+ ["Bitcoin", "S&P500", "Real Estate", "Gold"], as_=["key", "value"]
+).mark_bar().encode(
+ x="key:N",
+ y="value:Q",
+ color="key:N",
+ column="TERM",
+ )
+
+But that doesn't work.
","The chart itself works fine. The only problem is that the values are at such vastly different scales that only the largest shows up on a linear scale. You can address this by switching to a symlog scale. For example:
+import pandas as pd
+import io
+import altair as alt
+
+table = pd.read_csv(io.StringIO("""\
+row TERM Bitcoin S&P500 "Real Estate" Gold
+0 "High-Inflation/ Short term" 3097.94 -3700.78 761.23 6512.71
+1 "High-Inflation/ Mid term" — -3080.01 -8434.66 3242.40
+2 "High-Inflation/ Long term" — -2089.25 -9117.96 8174.43
+3 "Low-Inflation/ Short term" 780200.00 -273.71 1824.72 2214.51
+4 "Low-Inflation/ Mid term" 21013600.00 5331.40 35810.58 -2879.37
+5 "Low-Inflation/ Long term" 978017143.00. 15045.41 35895.81 861.90
+"""), delim_whitespace=True)
+
+alt.Chart(table).transform_fold(
+ ["Bitcoin", "S&P500", "Real Estate", "Gold"], as_=["key", "value"]
+).mark_bar().encode(
+ x="key:N",
+ y=alt.Y("value:Q", scale=alt.Scale(type='symlog')),
+ color="key:N",
+ column="TERM",
+)
+
+![]()
",python
+"Need help identifying right XPathI'm trying to scrape all of the table from this website : https://qmjhldraft.rinknet.com/results.htm?year=2018
+When the XPath is a simple td (like the names for example), I can scrape the table with the simple xpath being something like this :
+players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')
+
+And I can scrape the players name using this code :
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+
+PATH = 'C:\Program Files (x86)\chromedriver.exe'
+driver = webdriver.Chrome(PATH)
+driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')
+
+try:
+ elements = WebDriverWait(driver, 10).until(
+ EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
+ )
+finally:
+ players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')
+
+for player in players[:5]:
+ pl = player.text
+ print(pl)
+
+But when I get to the "Height" section, I can't find the right XPath. I guess this is because the td has a class, "ht-itemVisibility1", which changes how it has to be selected. I've tried a few different XPaths, such as:
+('//tr/td[@class="ht-itemVisibility1"][1]')
+('//tr/td[@class="ht-itemVisibility1"][5]')
+('//tr[@rnid]/td[5]')
+
+to no avail. Can someone enlighten me on how to capture this XPath for a td with a class? Thanks a lot.
","Try this
+from selenium import webdriver
+from webdriver_manager.chrome import ChromeDriverManager
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+
+driver = webdriver.Chrome(ChromeDriverManager().install())
+driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')
+
+try:
+ elements = WebDriverWait(driver, 10).until(
+ EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
+ )
+finally:
+ players = driver.find_elements_by_xpath('//tr[@rnid]/td[4]')
+
+for player in players[:5]:
+ pl = player.text
+ print(pl)
+
+players_height = driver.find_elements_by_xpath('//tr/td[@class="ht-itemVisibility1"][1]')
+
+for player in players_height[:5]:
+ pl = player.text
+ print(pl)
+
+players_last_team = driver.find_elements_by_xpath('//tr/td[@class="ht-itemVisibility1"][5]')
+
+for player in players_last_team[:5]:
+ pl = player.text
+ print(pl)
+
+
+I don't know why it wasn't working for you, but it works fine for me.
+Results:
+![]()
",python
+"Output missing dates by group of columnsI have time series of y per store and product stored in the following dataframe:
+ ds store product y
+0 2016-01-01 a salt 2
+1 2016-01-02 a salt 5
+2 2016-01-04 a salt 3
+3 2016-01-05 a salt 3
+4 2016-01-06 a salt 4
+5 2016-01-07 a salt 3
+6 2016-01-01 b pepper 2
+7 2016-01-02 b pepper 2
+8 2016-01-03 b pepper 1
+9 2016-01-04 b pepper 2
+10 2016-01-06 b pepper 4
+11 2016-01-07 b pepper 2
+
+I would like to output all the missing dates per store, product and return the following result:
+ ds store product
+0 2016-01-03 a salt
+1 2016-01-05 b pepper
+
","Use groupby_resample:
+# Assuming ds is datetime64 else use:
+# df['ds'] = pd.to_datetime(df['ds'])
+out = df.groupby(['store', 'product']).resample('D', on='ds')['y'] \
+ .first().loc[lambda x: x.isna()].index.to_frame(index=False)
+print(out)
+
+# Output
+ store product ds
+0 a salt 2016-01-03
+1 b pepper 2016-01-05
+
+Details:
+>>> df.groupby(['store', 'product']).resample('D', on='ds')['y'].first()
+store product ds
+a salt 2016-01-01 2.0
+ 2016-01-02 5.0
+ 2016-01-03 NaN # <- missing value == missing date
+ 2016-01-04 3.0
+ 2016-01-05 3.0
+ 2016-01-06 4.0
+ 2016-01-07 3.0
+b pepper 2016-01-01 2.0
+ 2016-01-02 2.0
+ 2016-01-03 1.0
+ 2016-01-04 2.0
+ 2016-01-05 NaN # <- missing value == missing date
+ 2016-01-06 4.0
+ 2016-01-07 2.0
+Name: y, dtype: float64
+
+Update: If you have a date in the ds column without a value in the y column, just use fillna({'y': 0}) before groupby_resample
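For intuition, the same idea can be sketched without pandas: build each group's full daily range between its first and last date and report the days that are absent. This is a plain-Python illustration of the logic, not what resample does internally; missing_dates and the sample rows are made up for the example.

```python
from collections import defaultdict
from datetime import date, timedelta

def missing_dates(rows):
    # rows: iterable of (ds, store, product) tuples, with ds a datetime.date
    seen = defaultdict(set)
    for ds, store, product in rows:
        seen[(store, product)].add(ds)
    out = []
    for (store, product), days in sorted(seen.items()):
        day, last = min(days), max(days)
        while day <= last:
            if day not in days:
                out.append((day, store, product))
            day += timedelta(days=1)
    return out

rows = [(date(2016, 1, d), 'a', 'salt') for d in (1, 2, 4, 5, 6, 7)]
rows += [(date(2016, 1, d), 'b', 'pepper') for d in (1, 2, 3, 4, 6, 7)]
print(missing_dates(rows))
```

Like the resample approach, this only finds gaps between the first and last date of each group, not dates missing at either end.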
",python
+"DRF: Can't create object null. Error: value in column ""network_from_id"" violates not-null constraintI want to create a Transaction object, but I get this error:
+django/db/backends/utils.py", line 84, in _execute
+ return self.cursor.execute(sql, params)
+django.db.utils.IntegrityError: null value in column "network_from_id" violates not-null constraint
+DETAIL: Failing row contains (6, 12, null, null).
+
+What is wrong with my code, and what is the proper way to create an object with ModelViewSet?
+My code:
+models.py
+class Networks(models.Model):
+ Name = models.CharField(max_length=50)
+ ...
+
+class Transaction(models.Model):
+ network_from = models.ForeignKey(Networks, on_delete=models.DO_NOTHING)
+ network_to = models.ForeignKey(Networks, on_delete=models.DO_NOTHING)
+ ...
+
+view.py
+class TransactionView(viewsets.ModelViewSet):
+ serializer_class = TransactionSerializer
+ queryset = Transaction.objects.all()
+
+ def get_transaction_create_serializer(self, *args, **kwargs):
+ serializer_class = TransactionSerializer
+ kwargs["context"] = self.get_serializer_context()
+ return serializer_class(*args, **kwargs)
+
+
+ def create(self, request, *args, **kwargs):
+ serializer = self.get_transaction_create_serializer(data=request.data)
+ serializer.is_valid(raise_exception=True)
+ self.perform_create(serializer)
+ headers = self.get_success_headers(serializer.data)
+ response = {"result": serializer.data}
+ return Response(
+ response, status=status.HTTP_201_CREATED, headers=headers
+ )
+
+serializers.py
+class TransactionSerializer(serializers.ModelSerializer):
+ network_from = serializers.SerializerMethodField()
+ network_to = serializers.SerializerMethodField()
+
+ class Meta:
+ model = Fee
+ fields = '__all__'
+
+ def create(self, validated_data):
+ instance = super().create(validated_data=validated_data)
+ return instance
+
+
+ def get_network_from(self, obj):
+ network = obj.network_from
+ return NetworksSerializer(network).data
+ def get_network_to(self, obj):
+ network = obj.network_to
+ return NetworksSerializer(network).data
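+
","Add null=True to each field if the relation is allowed to be empty. Note also that SerializerMethodField is read-only, so network_from and network_to are never taken from the request data, which is why they reach the database as null.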
+network_from = models.ForeignKey(Networks, on_delete=models.DO_NOTHING, null=True)
+network_to = models.ForeignKey(Networks, on_delete=models.DO_NOTHING, null=True)
+
",python
+"How to access all the child table entries based on parent table entry in Flask-SQLAlchemyI am developing a flask-based website, I have two tables in the database, Customer(PK=sno, ...) and Item(PK=iid, ..., FK=customer.sno).
+I want to display all the Items corresponding to any customer at a time when I click Bill button corresponding to the customer.
+File app.py contains code
+@app.route("/show_bill/<int:sno>")
+def show_bill(sno):
+ customer = Customer.query.filter_by(sno=sno).first()
+ show_bill = Item.query.filter_by(cid=sno)
+ return render_template('show_bill.html', customer=customer, show_bill=show_bill)
+
+And, show_bill.html contains
+<div class="container my-3">
+ <h2>Bill for customer {{ customer.cname }}</h2>
+ {% if show_bill|length == 0 %}
+ <div class="alert alert-dark" role="alert">
+ No Item found. Add first Item now to appear for this customer!
+ </div>
+ {% else %}
+ <table class="table">
+ <thead>
+ <tr>
+ <th scope="col">SNo</th>
+ <th scope="col">Customer ID</th>
+ <th scope="col">Name of Item</th>
+ <th scope="col">No. of Items</th>
+ <th scope="col">Rate</th>
+ <th scope="col">Discount(%)</th>
+ <th scope="col">Time</th>
+ <th scope="col">Actions(Edit)</th>
+ </tr>
+ </thead>
+
+ <tbody>
+ {% for item in show_bill %}
+ <tr>
+ <th scope="row">{{loop.index}}</th>
+ <td>{{item.cid}}</td>
+ <td>{{item.iname}}</td>
+ <td>{{item.icount}}</td>
+ <td>{{item.irate}}</td>
+ <td>{{item.idiscount}}</td>
+ <td>{{item.date_created}}</td>
+ <td>
+ <a href="/update_item/{{item.iid}}" type="button" class="btn btn-outline-dark btn-sm mx-1">Update</a>
+ <a href="/delete_item/{{item.iid}}" type="button" class="btn btn-outline-dark btn-sm mx-1">Delete</a>
+ </td>
+ </tr>
+ {% endfor %}
+ </tbody>
+ </table>
+ {% endif %}
+</div>
+
+But, on clicking Bill button, I get an error:
+TypeError: object of type 'BaseQuery' has no len()
+Also, if I remove show_bill=show_bill, it renders the show_bill.html page but does not show any data, since none is being passed to the template (probably the problem is here).
+Can someone help to resolve this? Or any alternative way to get the same thing.
+Thanks in Advance.
","You need to call .all() on the query to get a list of all the desired items in your table, like this:
+show_bill = Item.query.filter_by(cid=sno).all()
+
+Without this part, filter_by() returns a query object (the BaseQuery in your error) rather than a list. A query can be iterated, but it has no len(), which is why the |length filter in your template fails.
",python
+"I want to make button commands in one class in python tkinterI am getting the following error:
+log_print() missing 1 required positional argument: 'self'
+I would really appreciate it if you could tell me how to fix it.
+'''
+import tkinter as tk
+class SimpleApp(tk.Frame):
+ def __init__(self, parent):
+ tk.Frame.__init__(self, parent)
+ self.parent = parent
+ self.btn1 = tk.Button(self.parent, text="start", width=10, height=5, command = Button1.log_print)
+ self.btn2 = tk.Button(self.parent, text="stop", width=10, height=5)
+ self.textbox = tk.Text(self.parent, height = 10)
+ self.btn1.grid(row = 0, column = 0,sticky = "news", padx= 5)
+ self.btn2.grid(row = 0, column = 1, sticky = "news", padx= 5)
+ self.textbox.grid(row = 1, column = 0, columnspan = 2, sticky = "news", padx= 5)
+
+
+class Button1(SimpleApp):
+ def __init__(self, parent, textbox):
+ SimpleApp.__init__(self, parent, textbox)
+
+ def log_print(self):
+ self.textbox.insert("end", "1")
+ self.textbox.update()
+ self.textbox.see("end")
+
+
+
+if __name__ == "__main__":
+ root = tk.Tk()
+ SimpleApp(root).grid()
+ root.mainloop()
+
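+'''
","Do you mean something like this? log_print needs an instance of Button1, not the class itself, so the code below creates one and passes it the SimpleApp object: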
+import tkinter as tk
+
+
+class SimpleApp(tk.Frame):
+ def __init__(self, parent):
+ tk.Frame.__init__(self, parent)
+ self.parent = parent
+ self.textbox = tk.Text(self.parent, height=10)
+ self.btn1 = tk.Button(self.parent, text="start", width=10, height=5, command=Button1(self).log_print)
+ self.btn2 = tk.Button(self.parent, text="stop", width=10, height=5)
+ self.btn1.grid(row=0, column=0, sticky="news", padx=5)
+ self.btn2.grid(row=0, column=1, sticky="news", padx=5)
+ self.textbox.grid(row=1, column=0, columnspan=2, sticky="news", padx=5)
+
+
+class Button1:
+ def __init__(self, simple_app_object):
+ self.simple_app = simple_app_object
+
+ def log_print(self):
+ self.simple_app.textbox.insert("end", "1")
+ self.simple_app.textbox.update()
+ self.simple_app.textbox.see("end")
+
+
+if __name__ == "__main__":
+ root = tk.Tk()
+ SimpleApp(root).grid()
+ root.mainloop()
+
",python
+"get specific lags from plot_acf that are over the confidence intervalHow do I get from this plot
+import pandas as pd
+import matplotlib.pyplot as plt
+import statsmodels.api as sm
+
+dta = sm.datasets.sunspots.load_pandas().data
+dta.index = pd.Index(sm.tsa.datetools.dates_from_range('1700', '2008'))
+del dta["YEAR"]
+sm.graphics.tsa.plot_acf(dta.values.squeeze(), lags=40)
+plt.show()
+
+the specific lags that are over/under the confidence interval?
","You can use acf rather than the plot interface to get the numerical values. The CI that is returned from acf is centered around the estimated ACF value, so the key step is to subtract the ACF from the confidence interval so that it is centered at 0.
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+import statsmodels.api as sm
+from statsmodels.tsa.stattools import acf
+
+dta = sm.datasets.sunspots.load_pandas().data
+dta.index = pd.Index(sm.tsa.datetools.dates_from_range('1700', '2008'))
+nlags = 40
+alpha = 0.05
+fft=True
+adjusted = False
+missing = "none"
+bartlett_confint = True
+x = dta.SUNACTIVITY
+a, ci = acf(x, nlags=40, alpha=0.05)
+# Key step
+centered_ci = ci - a[:,None]
+outside = np.abs(a) >= centered_ci[:,1]
+inside = ~outside
+
+print(outside)
+
+which shows
+[ True True True False True True True False False True True True
+ True False False True True True False False True True True False
+ False False True True False False False False False False False False
+ False False False False False]
+
",python
+"Empty plots when trying to adapt parallel coordinates example to my dataI'm trying to reproduce the Parallel Coordinates example in Altair, but unfortunately I can't adapt it so that it works for my data. When I run the code below, the plots show up empty without any lines. Could you please provide a pre-defined structure (perhaps with some explanation for beginners like me) so that we can adapt this code to our own goals? Thanks.
+from sklearn import datasets
+data_wine = datasets.load_wine (as_frame = True).frame
+new_data = data_wine.drop (['proline', 'magnesium'], axis = 1)
+new_data = new_data.reset_index().melt(id_vars = ['index', 'target'])
+base = alt.Chart(
+ new_data
+).transform_window(
+ index="count()"
+).transform_fold(
+ #[alcohol malic_acid ash alcalinity_of_ash magnesium total_phenols flavanoids nonflavanoid_phenols proanthocyanins color_intensity hue od280/od315_of_diluted_wines "proline "]
+ ["alcohol","malic_acid","ash","alcalinity_of_ash","total_phenols","flavanoids","nonflavanoid_phenols","proanthocyanins","color_intensity","hue","od280/od315_of_diluted_wines"]
+).transform_joinaggregate(
+ min="min(value)",
+ max="max(value)",
+ groupby=["variable"]
+).transform_calculate(
+ norm_val="(datum.variable - datum.min) / (datum.max - datum.min)",
+ mid="(datum.min + datum.max) / 2"
+).properties(width=600, height=300)
+
+lines = base.mark_line(opacity=0.3).encode(
+ alt.Color ('target:N'),
+ alt.Detail ('index:N'),
+ x='variable:N',
+ y=alt.Y('norm_val:Q', axis=None),
+ #tooltip=["petalLength:N", "petalWidth:N", "sepalLength:N", "sepalWidth:N"]
+)
+
+rules = base.mark_rule(
+ color="#ccc", tooltip=None
+).encode(
+ x="variable:N",
+ detail="count():Q",
+)
+
+def ytick(yvalue, field):
+ scale = base.encode(x='variable:N', y=alt.value(yvalue), text=f"min({field}):Q")
+ return alt.layer(
+ scale.mark_text(baseline="middle", align="right", dx=-5, tooltip=None),
+ scale.mark_tick(size=8, color="#ccc", orient="horizontal", tooltip=None)
+ )
+
+alt.layer(
+ lines, rules, ytick(0, "max"), ytick(150, "mid"), ytick(300, "min")
+).configure_axisX(
+ domain=False, labelAngle=0, tickColor="#ccc", title=None
+).configure_view(
+ stroke=None
+)
+
","Your plots are empty because your input data does not have the same structure as in the example you are following. You have melted your wide data frame to long format in pandas. This is the same operation that the transform_fold function performs in Altair, so in your example you are trying to do it twice. Below I have removed the manual pandas melt and changed the variable names back to the ones automatically assigned by transform_fold (key and value):
+from sklearn import datasets
+import altair as alt
+
+data_wine = datasets.load_wine (as_frame = True).frame
+new_data = data_wine.drop (['proline', 'magnesium'], axis = 1)
+
+base = alt.Chart(
+ new_data
+).transform_window(
+ index="count()"
+).transform_fold(
+ ["alcohol","malic_acid","ash","alcalinity_of_ash","total_phenols","flavanoids","nonflavanoid_phenols","proanthocyanins","color_intensity","hue","od280/od315_of_diluted_wines"]
+).transform_joinaggregate(
+ min="min(value)",
+ max="max(value)",
+ groupby=["key"]
+).transform_calculate(
+ norm_val="(datum.value - datum.min) / (datum.max - datum.min)",
+ mid="(datum.min + datum.max) / 2"
+).properties(width=1200, height=300)
+
+lines = base.mark_line(opacity=0.3).encode(
+ x='key:N',
+ y=alt.Y('norm_val:Q', axis=None),
+ color=alt.Color ('target:N'),
+ detail=alt.Detail ('index:N'),
+)
+
+rules = base.mark_rule(
+ color="#ccc", tooltip=None
+).encode(
+ x="key:N",
+ detail="count():Q",
+)
+
+def ytick(yvalue, field):
+ scale = base.encode(x='key:N', y=alt.value(yvalue), text=f"min({field}):Q")
+ return alt.layer(
+ scale.mark_text(baseline="middle", align="right", dx=-5, tooltip=None),
+ scale.mark_tick(size=8, color="#ccc", orient="horizontal", tooltip=None)
+ )
+
+alt.layer(
+ lines, rules, ytick(0, "max"), ytick(150, "mid"), ytick(300, "min")
+).configure_axisX(
+ domain=False, labelAngle=0, tickColor="#ccc", title=None
+).configure_view(
+ stroke=None
+)
+
+![]()
+You could create a simpler parallel coordinates plot like this if you are OK with not having a separate y-axis for each column in the data frame:
+from sklearn import datasets
+import altair as alt
+
+data = datasets.load_wine (as_frame=True).frame
+
+num_cols = ["alcohol","malic_acid","ash","alcalinity_of_ash","total_phenols","flavanoids","nonflavanoid_phenols","proanthocyanins","color_intensity","hue","od280/od315_of_diluted_wines"]
+
+# You could skip this rescaling but it would compress the y-axis range for columns with smaller absolute values
+data[num_cols] = data[num_cols].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
+
+alt.Chart(data).transform_window(
+ index='count()'
+ ).transform_fold(
+ num_cols
+ ).mark_line().encode(
+ alt.X('key:O', title=None, scale=alt.Scale(nice=False, padding=0.05)),
+ alt.Y('value:Q', title=None),
+ alt.Color('target:N', title=None),
+ detail='index:N'
+).properties(
+ width=1200
+)
+
+![]()
+If you are using this for exploratory data analysis and don't need to customize the plot a lot, then you can also try out my experimental package altair_ally for quickly creating some common exploratory plots:
+from sklearn import datasets
+import altair_ally as aly
+
+
+data_wine = datasets.load_wine (as_frame = True).frame
+data_wine['target'] = data_wine['target'].astype(str)
+aly.parcoord(data_wine, color='target')
+
+![]()
",python
+"Pandas - concatenating multi-indexed dataframes keeps duplicate indicesHello, I'm trying to read in multiple DataFrames of the same structure and concatenate them into a single one. However, the combined DataFrame somehow keeps duplicates in the indices.
+df1 = processUSAdata(2223)
+df2 = processUSAdata(2224)
+
+print(df1)
+print(df2)
+
+test = pd.concat([df1,df2])
+test.set_index(['state','year'],inplace=True)
+print(test)
+
+outputs
+ state population violent_crime ... theft gta year
+0 ALABAMA 4903185 510.81 ... 1886.06 256.51 2223
+1 ALASKA 731545 867.07 ... 2066.04 357.74 2223
+2 ARIZONA 7278717 455.31 ... 1796.86 249.37 2223
+3 ARKANSAS 3017804 584.63 ... 2012.56 245.87 2223
+4 CALIFORNIA 39512223 441.21 ... 1586.35 358.77 2223
+5 COLORADO 5758736 380.95 ... 1858.26 383.99 2223
+6 CONNECTICUT 3565287 183.60 ... 1078.65 167.28 2223
+
+[7 rows x 12 columns]
+ state population violent_crime ... theft gta year
+0 ALABAMA 4903185 510.81 ... 1886.06 256.51 2224
+1 ALASKA 731545 867.07 ... 2066.04 357.74 2224
+2 ARIZONA 7278717 455.31 ... 1796.86 249.37 2224
+3 ARKANSAS 3017804 584.63 ... 2012.56 245.87 2224
+4 CALIFORNIA 39512223 441.21 ... 1586.35 358.77 2224
+5 COLORADO 5758736 380.95 ... 1858.26 383.99 2224
+6 CONNECTICUT 3565287 183.60 ... 1078.65 167.28 2224
+
+[7 rows x 12 columns]
+ population violent_crime murder ... burglary theft gta
+state year ...
+ALABAMA 2223 4903185 510.81 7.30 ... 531.88 1886.06 256.51
+ALASKA 2223 731545 867.07 9.43 ... 487.05 2066.04 357.74
+ARIZONA 2223 7278717 455.31 5.01 ... 394.29 1796.86 249.37
+ARKANSAS 2223 3017804 584.63 8.02 ... 599.61 2012.56 245.87
+CALIFORNIA 2223 39512223 441.21 4.28 ... 386.10 1586.35 358.77
+COLORADO 2223 5758736 380.95 3.79 ... 348.41 1858.26 383.99
+CONNECTICUT 2223 3565287 183.60 2.92 ... 180.66 1078.65 167.28
+ALABAMA 2224 4903185 510.81 7.30 ... 531.88 1886.06 256.51
+ALASKA 2224 731545 867.07 9.43 ... 487.05 2066.04 357.74
+ARIZONA 2224 7278717 455.31 5.01 ... 394.29 1796.86 249.37
+ARKANSAS 2224 3017804 584.63 8.02 ... 599.61 2012.56 245.87
+CALIFORNIA 2224 39512223 441.21 4.28 ... 386.10 1586.35 358.77
+COLORADO 2224 5758736 380.95 3.79 ... 348.41 1858.26 383.99
+CONNECTICUT 2224 3565287 183.60 2.92 ... 180.66 1078.65 167.28
+
+Hoping someone can help me out here.
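+Thanks in advance! :)
","It's probably because your index is not sorted. The index entries are not really duplicated; pandas only collapses a repeated outer label in the display when the rows that share it are adjacent: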
+test = pd.concat([df1, df2]).set_index(['state', 'year']).sort_index()
+print(test)
+
+# Output
+ population violent_crime theft gta
+state year
+ALABAMA 2223 4903185 510.81 1886.06 256.51
+ 2224 4903185 510.81 1886.06 256.51
+ALASKA 2223 731545 867.07 2066.04 357.74
+ 2224 731545 867.07 2066.04 357.74
+ARIZONA 2223 7278717 455.31 1796.86 249.37
+ 2224 7278717 455.31 1796.86 249.37
+ARKANSAS 2223 3017804 584.63 2012.56 245.87
+ 2224 3017804 584.63 2012.56 245.87
+CALIFORNIA 2223 39512223 441.21 1586.35 358.77
+ 2224 39512223 441.21 1586.35 358.77
+COLORADO 2223 5758736 380.95 1858.26 383.99
+ 2224 5758736 380.95 1858.26 383.99
+CONNECTICUT 2223 3565287 183.60 1078.65 167.28
+ 2224 3565287 183.60 1078.65 167.28
+
",python
+"How to rename a subset of columns based on offset/index and variable range?I am working with student test data. The data is provided in a new format, and I need to align it with the older format for an existing BI application. Where a range of columns used to contain question numbers, the column names now contain the correct answers (this includes duplicate column names as imported from the source XLSX; see the image below). Different year levels have a different number of questions, so the position of the "Total" column is not fixed. I need to rename the answer columns back to sequential question numbers starting at 1. What is the best way to achieve this?
+NB the sample df is not quite right as there are duplicate column names as the column name represents the correct answer. I cannot provide a sample df without importing it from a CSV/XLSX.
+Updated with some sample df data:
+
+
+
data = {
+ 'StudentID': [10, 11, 12, 13],
+ 'Year' : [2021,2021,2021,2021],
+ 'TestName': ['Math83', 'Math83','Math83','Math83'],
+ 'A' : ['C','A','C','B'],
+ 'B' : ['D','C','C','C'],
+ 'C' : ['D','D','C','D'],
+ 'D' : ['B','C','C','C'],
+ 'Total': [5,4,3,5,],
+ 'Score': [3,3,4,2,],
+ 'Error': [1,2,1,1]
+ }
+df = pd.DataFrame(data)
+
+
+
+![]()
","Here is a solution using set_axis()
+cols = df.columns
+tn = cols.get_loc('TestName')+1
+total = cols.get_loc('Total')
+
+(df.set_axis(cols[:tn].tolist() +
+ list(range(1,len(df.columns[tn:total+1]))) +
+ cols[total:].tolist(),axis=1))
+
+Output:
+ StudentID Year TestName 1 2 3 4 Total Score Error
+0 10 2021 Math83 C D D B 5 3 1
+1 11 2021 Math83 A C D C 4 3 2
+2 12 2021 Math83 C C C C 3 4 1
+3 13 2021 Math83 B C D C 5 2 1
+
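The positional logic can also be sketched on a plain list of column names, which makes it easy to see that duplicate names are not a problem. renumber_between is a hypothetical helper, and it assumes 'TestName' and 'Total' each appear exactly once.

```python
def renumber_between(columns, start='TestName', stop='Total'):
    # Hypothetical helper: keep everything up to and including `start`,
    # number the columns strictly between `start` and `stop` from 1,
    # and keep `stop` and everything after it.
    cols = list(columns)
    first = cols.index(start) + 1
    last = cols.index(stop)
    return cols[:first] + list(range(1, last - first + 1)) + cols[last:]

# Duplicate answer letters are fine because only positions are used.
cols = ['StudentID', 'Year', 'TestName', 'A', 'B', 'B', 'D', 'Total', 'Score', 'Error']
print(renumber_between(cols))
```

With a DataFrame, the result could then be assigned back with df.columns = renumber_between(df.columns).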
",python
+"How do you add dataclasses as valid index values to a plotly chart?I am trying to switch from the matplotlib pandas plotting backend to plotly. However, I am being held back by a common occurrence of this error:
+TypeError: Object of type Quarter is not JSON serializable
+
+Where Quarter is a dataclass in my codebase.
+For a minimal example, consider:
+@dataclass
+class Foo:
+ val:int
+
+df = pd.DataFrame({'x': [Foo(i) for i in range(10)], 'y':list(range(10))})
+
+df.plot.scatter(x='x', y='y')
+
+As expected, the above returns:
+TypeError: Object of type Foo is not JSON serializable
+
+Now, I don't expect plotly to be magical, but adding a __float__ magic method allows the Foo objects to be used with the matplotlib backend:
+# This works
+@dataclass
+class Foo:
+ val:int
+
+ def __float__(self):
+ return float(self.val)
+
+df = pd.DataFrame({'x': [Foo(i) for i in range(10)], 'y':list(range(10))})
+
+df.plot.scatter(x='x', y='y')
+
+How can I update my dataclass to allow for it to be used with the plotly backend?
","You can get pandas to cast the column to float before invoking the plotting backend.
+from dataclasses import dataclass
+import pandas as pd
+
+@dataclass
+class Foo:
+ val:int
+
+ def __float__(self):
+ return float(self.val)
+
+df = pd.DataFrame({'x': [Foo(i) for i in range(10)], 'y':list(range(10))})
+df["x"].astype(float)
+
+pd.options.plotting.backend = "plotly"
+
+df.assign(x=lambda d: d["x"].astype(float)).plot.scatter(x='x', y='y')
+
+Alternatively, with monkey patching, so that the cast happens automatically:
+
+from dataclasses import dataclass
+import pandas as pd
+import wrapt, json
+import plotly
+
+@wrapt.patch_function_wrapper(plotly, 'plot')
+def new_plot(wrapped, instance, args, kwargs):
+ try:
+ json.dumps(args[0][kwargs["x"]])
+ except TypeError:
+ args[0][kwargs["x"]] = args[0][kwargs["x"]].astype(float)
+ return wrapped(*args, **kwargs)
+
+
+
+@dataclass
+class Foo:
+ val:int
+
+ def __float__(self):
+ return float(self.val)
+
+df = pd.DataFrame({'x': [Foo(i) for i in range(10)], 'y':list(range(10))})
+df["x"].astype(float)
+
+pd.options.plotting.backend = "plotly"
+
+df.plot.scatter(x='x', y='y')
+
+
+
",python
+"Same code does not work in venv despite installing dependencies from same requirements.txtI am developing a web application and trying to migrate from Spyder to VS Code.
+It was working with the default interpreter, so I created a new venv but when I start the server it does not work with the same code that was working without the venv.
+Error description:
+File "C:\Users\User\Desktop\Flask\app.py", line 77, in index
+measurement_mx = rs.all()
+AttributeError: 'ResultProxy' object has no attribute 'all'
+
+I installed exactly the same dependencies with pip install -r requirements.txt.
+Can you help me what the solution could be, I could not find this problem unfortunately.
+Relevant code snippet:
+@app.route('/', methods=['POST', 'GET'])
+def index():
+ with engine.connect() as con:
+ rs = con.execute(SQL_string)
+ measurement_mx = rs.all() #this is the error line
+ measurement_list = []
+ for row in measurement_mx:
+ measurement_list.append(row._data)
+ measurement_list = transpose(measurement_list)
+ return render_template('index.html', measurement_list=measurement_list )
+
+
+Thank you in advance!
","Welcome to StackOverflow!
+The issue could be that your requirements.txt file may specify what packages to install, but not their exact version, so could you please:
+
+- paste the content of your requirements file?
+- check the packages versions between your two virtual environments?
+
+Some thoughts:
+
+Now, concerning your exact issue: ResultProxy does not come from Flask but from SQLAlchemy, and SQLAlchemy v1.4 replaced it:
+
+class sqlalchemy.engine.Result(cursor_metadata)
+Represent a set of database results.
+
+New in version 1.4: The Result object provides a completely updated usage model and calling facade for SQLAlchemy Core and SQLAlchemy ORM. In Core, it forms the basis of the CursorResult object which replaces the previous ResultProxy interface. When using the ORM, a higher level object called ChunkedIteratorResult is normally used.
+
+
+(emphasis mine)
+Which means you can:
+
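+- either pin the SQLAlchemy version in your requirements.txt file (your code calls .all() on the result, which needs version 1.4 or later),
+- or update your code so that it also works with the SQLAlchemy version installed in the venv.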
+
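To compare package versions between the two environments, a small standard-library sketch like the following could be run with each interpreter. installed_version is a hypothetical helper, and the package names in the loop are only the ones that seem relevant here.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    # Return the installed version string, or None if the package is absent.
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Run this in both interpreters and compare the output.
for name in ('SQLAlchemy', 'Flask'):
    print(name, installed_version(name))
```

If the versions differ, pinning the working one in requirements.txt (for example a line of the form SQLAlchemy==<version>) should make the venv behave like the original environment. As far as I know, rs.fetchall() is available on both the old ResultProxy and the newer result object, so it may be an option if you need to support both.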
",python
+"Pandas Dataframe: Change each value's ones-digitI am writing unit tests for 2 data frames to test for equality by converting them to dictionaries and using unittest's assertDictEqual(). The context is that I'm converting Excel functions to Python but due to their different rounding system, some values are off by merely +/- 1
+I've attempted to use DF.round(-1) to round to the nearest ten, but because of the +/- 1 difference some numbers may round in opposite directions: for example, 15 would round up but 14 would round down, and the test would fail. All values in the 12x20 data frame are integers.
+What I'm looking for (feel free to suggest any alternate solution):
+
+- A CLEAN way to test for approximate equality of data frames or nested dictionaries
+- or a way to make the ones-digit of each element '0' to avoid the rounding issue
+
+Thank you, and please let me know if any additional context is required. Due to confidentiality issues and my NDA (non-disclosure agreement), I cannot share the code but I can formulate an example if necessary
","You could take the element-wise absolute difference between the two DataFrames and check that no value exceeds a certain tolerance (in your case 1). For example, we can create two DataFrames with values in the interval [0.0, 1.0).
+import numpy as np
+import pandas as pd
+
+np.random.seed(42)
+
+# df1 and df2 are 10x10 DataFrames with values in the interval [0.0, 1.0)
+df1 = pd.DataFrame(np.random.random_sample((10,10)))
+df2 = pd.DataFrame(np.random.random_sample((10,10)))
+
+Then the following should return True (note <= rather than <, so that a difference of exactly 1 still passes):
+(abs(df2-df1) <= 1).all(axis=None)
+
+And you can write an assert statement like:
+assert (abs(df2-df1) <= 1).all(axis=None)
+
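Since the question also mentions nested dictionaries, here is a minimal sketch of the same tolerance check applied to them. approx_equal is a hypothetical helper, and the sample values are invented for illustration.

```python
def approx_equal(a, b, tol=1):
    # Recursively compare nested dicts; numeric leaves may differ by at most `tol`.
    a_is_dict, b_is_dict = isinstance(a, dict), isinstance(b, dict)
    if a_is_dict or b_is_dict:
        return (a_is_dict and b_is_dict and a.keys() == b.keys()
                and all(approx_equal(a[k], b[k], tol) for k in a))
    return abs(a - b) <= tol

excel = {'q1': {'north': 14, 'south': 20}, 'q2': {'north': 15, 'south': 31}}
python = {'q1': {'north': 15, 'south': 20}, 'q2': {'north': 14, 'south': 30}}
print(approx_equal(excel, python))
```

In a unittest test case this could be used as self.assertTrue(approx_equal(df1.to_dict(), df2.to_dict())) in place of assertDictEqual.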
",python
+"How to deal with ImmutableMultiDictI'm trying to test my flask API with a POST request, but I have problems to deal with the ImmutableMultiDict.
+The API:
+@app.route("/df/", methods = ['GET', 'POST'])
+def get_df():
+ if request.method == 'POST':
+ select = request.form
+
+request:
+curl -X POST http://127.0.0.1:5000/df/ -d '{"ticker": "ETH-PERP"}'
+
+print(select):
+ImmutableMultiDict([('{"ticker": "ETH-PERP"}', '')])
+
+How do I access the value ("ETH-PERP") ?
+I tried:
+select = request.form.getlist("ticker")
+
+output: []
+select = request.form.to_dict(flat=True)
+print(select[0])
+
+output: keyError: 0
+select = request.form.to_dict().values()[0]
+
+output: 'dict_values' object is not subscriptable
+any suggestions?
","The problem is that the request data is not a form, it's JSON.
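+You should probably use request.json to read it, which should give you a regular dictionary. Note that curl -d sends the body as application/x-www-form-urlencoded by default, so the request also needs a Content-Type: application/json header (add -H 'Content-Type: application/json' to the curl command) for request.json to be populated.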
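A minimal sketch of the route reading the body as JSON, assuming Flask 1.0 or newer. The response shape (a JSON object with a ticker key) is made up for illustration. get_json(force=True) parses the body even when the client did not send a JSON content type, and silent=True returns None instead of raising on malformed input.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/df/', methods=['GET', 'POST'])
def get_df():
    if request.method == 'POST':
        # force=True ignores the Content-Type header; silent=True yields
        # None for an unparsable body instead of raising an error.
        payload = request.get_json(force=True, silent=True) or {}
        return jsonify(ticker=payload.get('ticker'))
    return jsonify(ticker=None)


if __name__ == '__main__':
    app.run()
```

With this handler the original curl command should return the ticker even without an explicit JSON header.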
",python
+"Is there a way to convert array of bytes to array of floatsI did a python script that gets data from shared mem and convert it from bytes to floats.
+The main problem is that it is very slow.
+This is how I init the shared memory:
+ def _shared_mem_init(self):
+
+ warnings.filterwarnings("ignore")
+ path = "/tmp"
+
+ # shared memory header
+ key = ipc.ftok(path, 0x3110)
+ self.shm = ipc.SharedMemory(key, 0, 0)
+ self.shm.attach(0, 0)
+ # shared memory X values
+ key_x = ipc.ftok(path, 0x3111)
+ self.shm_x = ipc.SharedMemory(key_x, 0, 0)
+ self.shm_x.attach(0, 0)
+ # shared memory Y values
+ key_y = ipc.ftok(path, 0x3112)
+ self.shm_y = ipc.SharedMemory(key_y, 0, 0)
+ self.shm_y.attach(0, 0)
+ # shared memory Z values
+ key_z = ipc.ftok(path, 0x3113)
+ self.shm_z = ipc.SharedMemory(key_z, 0, 0)
+ self.shm_z.attach(0, 0)
+ # shared memory R values
+ key_r = ipc.ftok(path, 0x3114)
+ self.shm_r = ipc.SharedMemory(key_r, 0, 0)
+ self.shm_r.attach(0, 0)
+ # shared memory G values
+ key_g = ipc.ftok(path, 0x3115)
+ self.shm_g = ipc.SharedMemory(key_g, 0, 0)
+ self.shm_g.attach(0, 0)
+ # shared memory B values
+ key_b = ipc.ftok(path, 0x3116)
+ self.shm_b = ipc.SharedMemory(key_b, 0, 0)
+ self.shm_b.attach(0, 0)
+
+ self.shm.write(byte_true, 0)
+ print("shared Memory init")
+
+it gets the xyzrgb from the shared memory.
+and after I get the data I try to convert it from bytes to floats:
+ def next_point_cloud(self):
+
+ # read 4 bytes from header - Data Lines
+ buf = self.shm.read(4, 5)
+ data_lines = int.from_bytes(buf, "little")
+
+ # read all data
+ buff_x2 = self.shm_x.read(4 * data_lines, 0)
+ buff_y2 = self.shm_y.read(4 * data_lines, 0)
+ buff_z2 = self.shm_z.read(4 * data_lines, 0)
+ buff_r2 = self.shm_r.read(data_lines, 0)
+ buff_g2 = self.shm_g.read(data_lines, 0)
+ buff_b2 = self.shm_b.read(data_lines, 0)
+
+ # split all data
+ buff_x_breakdown = [buff_x2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
+ buff_y_breakdown = [buff_y2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
+ buff_z_breakdown = [buff_z2[i:i + 4] for i in range(0, 4 * data_lines, 4)]
+ buff_r_breakdown = [buff_r2[i] for i in range(0, data_lines, 1)]
+ buff_g_breakdown = [buff_g2[i] for i in range(0, data_lines, 1)]
+ buff_b_breakdown = [buff_b2[i] for i in range(0, data_lines, 1)]
+
+ xyz = np.zeros((data_lines, 3))
+ colors = np.zeros((data_lines, 3))
+
+ for i in range(data_lines):
+ xyz[i, 0] = struct.unpack('f', buff_x_breakdown[i])[0]
+ xyz[i, 1] = struct.unpack('f', buff_y_breakdown[i])[0]
+ xyz[i, 2] = struct.unpack('f', buff_z_breakdown[i])[0]
+
+ colors[i, 0] = float(buff_r_breakdown[i]) / 255.0
+ colors[i, 1] = float(buff_g_breakdown[i]) / 255.0
+ colors[i, 2] = float(buff_b_breakdown[i]) / 255.0
+
+ self.pcdA.points = o3d.utility.Vector3dVector(xyz)
+ self.pcdA.colors = o3d.utility.Vector3dVector(colors)
+
+
+So my question is: Is there a way in Python to write code that runs faster than the for loop that I wrote?
","You can use np.frombuffer to construct a Numpy array from a bytes object:
+>>> buffer = b''.join(struct.pack('f', x) for x in [1.0, 2.0, 3.0])
+>>> np.frombuffer(buffer, dtype=np.float32, count=3)
+array([1., 2., 3.], dtype=float32)
+
+So in your case this should be:
+xyz = np.stack(
+ [
+ np.frombuffer(self.shm_x.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
+ np.frombuffer(self.shm_y.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
+ np.frombuffer(self.shm_z.read(4*data_lines, 0), dtype=np.float32, count=data_lines),
+ ],
+ axis=1,
+)
+
+and the same for colors:
+colors = (1./255) * np.stack(
+ [
+ # uint8 rather than np.byte (signed int8): color values run from 0 to 255
+ np.frombuffer(self.shm_r.read(data_lines, 0), dtype=np.uint8, count=data_lines),
+ np.frombuffer(self.shm_g.read(data_lines, 0), dtype=np.uint8, count=data_lines),
+ np.frombuffer(self.shm_b.read(data_lines, 0), dtype=np.uint8, count=data_lines),
+ ],
+ axis=1,
+)
+
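One detail worth checking for the color channels is the dtype. The question divides each byte by 255.0, which suggests unsigned values from 0 to 255, and np.byte is a signed 8-bit type. A small sketch with three made-up bytes shows the difference:

```python
import numpy as np

raw = bytes([0, 128, 255])  # three made-up color bytes

# int8: values above 127 wrap around to negative numbers
signed = np.frombuffer(raw, dtype=np.byte)
# uint8: keeps the full 0..255 range
unsigned = np.frombuffer(raw, dtype=np.uint8)

colors = unsigned / 255.0  # float64 values in [0.0, 1.0]
print(signed, unsigned, colors)
```

With the signed dtype, bright channels would come out as negative colors after scaling.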
",python
+"Python program to convert infix to prefix notationI keep on getting an IndexError: list index out of range, return self.data[-1] # the last element in the list; I think I know what is causing this but I have no clue how to fix it
+Here is the Stack Class I used:
+class Stack:
+ # LIFO Stack implementation using a Python list as underlying storage.
+ def __init__(self):
+ self.data =[]
+
+ def __len__(self):
+ return len(self.data)
+
+ def is_empty(self):
+ return len(self.data)==0
+
+ def push(self, e):
+ self.data.append(e)
+
+ def top(self):
+ return self.data[-1]
+
+ def pop(self):
+ return self.data.pop()
+
+And the corresponding code I made:
+def operatorpriority(x):
+ if x == "+" or x == "-":
+ return 1
+ elif x == "*" or x == "/":
+ return 2
+ else:
+ return 3
+ return 0
+
+def polishnotation(A):
+ # Converts Infix to Prefix Notation
+ stack = Stack()
+ stack.push(')')
+ A = A + '('
+ output = ""
+ for i in range(len(A)-1, -1, -1):
+ print(i)
+ if A[i].isnumeric() == True:
+ output+=A[i]
+ elif A[i] == ")":
+ stack.push(A[i])
+ elif A[i] == "-" or A[i] == "+" or A[i] == "*" or A[i] == "/" or A[i] == "^":
+ if A[i] == "^":
+ while operatorpriority(A[i]) <= operatorpriority(stack.top()):
+ output+=stack.pop()
+ else:
+ while operatorpriority(A[i]) < operatorpriority(stack.top()):
+ output+=stack.pop()
+ stack.push(A[i])
+ elif A[i] == "(":
+ while stack.is_empty()== False:
+ if stack.top() != "(":
+ output+=stack.pop()
+ stack.pop()
+ while stack.is_empty()== False:
+ output+=stack.pop()
+ print(output)
+
+
+
+InfixInput = input("Input infix notation: ")
+polishnotation(InfixInput)
+Sample Input:
+(a+b)*(c-d)
+Expected Output:
+*+ab-cd
","
+- You have A = A + '('. That adds at the wrong end. Just do A = '('+A+')' and skip the extra push.
+- You are giving ')' the same priority as '^'. In operatorpriority, your else: should be elif x =='^':.
+- In your elif A[i] == "(" clause, you are popping until '('. That's the wrong type of parens. And you don't break out of that loop until the stack is empty. You need to break when you get to a ')'.
+- Your example shows (a+b)*(c-d), but your code only allows digits. I haven't changed that.
+
+This works:
+class Stack:
+    # LIFO Stack implementation using a Python list as underlying storage.
+    def __init__(self):
+        self.data = []
+
+    def __len__(self):
+        return len(self.data)
+
+    def is_empty(self):
+        return len(self.data) == 0
+
+    def push(self, e):
+        self.data.append(e)
+
+    def top(self):
+        return self.data[-1]
+
+    def pop(self):
+        return self.data.pop()
+
+def operatorpriority(x):
+    if x in "+-":
+        return 1
+    elif x in "*/":
+        return 2
+    elif x in "^":
+        return 3
+    return 0
+
+def polishnotation(A):
+    # Converts Infix to Prefix Notation
+    stack = Stack()
+    A = '(' + A + ')'
+    output = ""
+    # Scan the expression from right to left.
+    for c in A[::-1]:
+        if c.isnumeric():
+            output += c
+        elif c == ")":
+            stack.push(c)
+        elif c in "+-*/^":
+            if c == "^":
+                while operatorpriority(c) <= operatorpriority(stack.top()):
+                    output += stack.pop()
+            else:
+                while operatorpriority(c) < operatorpriority(stack.top()):
+                    output += stack.pop()
+            stack.push(c)
+        elif c == "(":
+            # Pop operators until the matching ')' is found.
+            while not stack.is_empty():
+                c1 = stack.pop()
+                if c1 == ')':
+                    break
+                output += c1
+    while not stack.is_empty():
+        output += stack.pop()
+    # The output was built right to left, so reverse it to get prefix order.
+    return output[::-1]
+
+print(polishnotation('(3+4)*(5+6)'))  # *+34+56
+
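The sample input in the question uses letters rather than digits. A compact sketch of the same right-to-left scan that accepts any alphanumeric operand follows. The names infix_to_prefix and PRIORITY are made up for illustration, single-character operands and balanced parentheses are assumed, and a plain list stands in for the Stack class.

```python
PRIORITY = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}


def infix_to_prefix(expr):
    stack = []   # holds operators and ')' markers
    output = []  # collected right to left, reversed at the end
    for c in reversed('(' + expr + ')'):
        if c.isalnum():
            output.append(c)
        elif c == ')':
            stack.append(c)
        elif c in PRIORITY:
            # Pop operators of strictly higher priority; for the
            # right-associative '^', also pop operators of equal priority.
            while stack[-1] != ')' and (
                PRIORITY[stack[-1]] > PRIORITY[c]
                or (c == '^' and PRIORITY[stack[-1]] == PRIORITY[c])
            ):
                output.append(stack.pop())
            stack.append(c)
        elif c == '(':
            # Flush operators back to the matching ')', then drop it.
            while stack[-1] != ')':
                output.append(stack.pop())
            stack.pop()
    return ''.join(reversed(output))


print(infix_to_prefix('(a+b)*(c-d)'))
```

Unbalanced parentheses would raise an IndexError here, so real input probably needs validation first.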
",python
+"selenium.common.exceptions.NoSuchElementException error sending text to input fields using Selenium and PythonI'm trying to write a simple program to fill out a form (including order ID and zip code) to be submitted but I keep getting the following error:
+selenium.common.exceptions.NoSuchElementException: Message: " (without any text following "Message
+
+Code trials:
+from selenium import webdriver
+browser = webdriver.Safari()
+browser.get('https://knowledge.tonal.com/s/order-status')
+
+orderElm = browser.find_element_by_id('input-3')
+orderElm.send_keys('1000XXX')
+
+zipcodeElm = browser.find_element_by_id('input-4')
+zipcodeElm.send_keys('90210')
+zipcodeElm.submit()
+
+I've double-checked my element ID several times and though I'm very new to this, I'm fairly confident I have the correct element IDs. What am I doing incorrectly?
","To send a character sequence to the element you need to induce WebDriverWait for the element_to_be_clickable() and you can use either of the following Locator Strategies:
+
+Using XPATH:
+driver.get("https://knowledge.tonal.com/s/order-status")
+WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@data-aura-class='cOrderSearch']//following::input[1]"))).send_keys('1000XXX')
+driver.find_element(By.XPATH, "//div[@data-aura-class='cOrderSearch']//following::input[2]").send_keys("90210")
+
+
+Note: You have to add the following imports :
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support import expected_conditions as EC
+
+
",python
+"How to reformat and modify a date string according to DIN standard: 5008:2020-03?How can I reformat and modify a date string?
+
+My code:
+details = response.xpath('//div[@class="article-info"]')
+for detail in details:
+ released = detail.xpath('.//ul/li[1]/span[1]/span/text()').get()
+ item['released'] = released
+ yield item
+
+My output:
+
+Released 2021, September 24
+
+
+Desired output:
+
+
+- September, 2021
+
+
+and:
+
+24.09.2021
+
+
","It has nothing to do with scrapy, as this is a simple string parsing problem.
+For example, you might want to try this, provided that released is a string value like Released 2021, September 24:
+import datetime
+
+released = "Released 2021, September 24".replace("Released ", "")
+print(datetime.datetime.strptime(released, '%Y, %B %d').strftime("%d.%m.%Y"))
+
+Output:
+24.09.2021
+
+To get the longer output:
+print(datetime.datetime.strptime(released, '%Y, %B %d').strftime("%d. %B, %Y"))
+
+Output:
+24. September, 2021
+
+EDIT:
+In order to apply it in your code, try this:
+details = response.xpath('//div[@class="article-info"]')
+for detail in details:
+ released = detail.xpath('.//ul/li[1]/span[1]/span/text()').get()
+ item['released'] = datetime.datetime.strptime(released.replace("Released ", ""), '%Y, %B %d').strftime("%d.%m.%Y")
+ yield item
+
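The question also asks for a month-and-year form. A small sketch that returns both formats at once is below. The function name reformat_release is made up, the input is assumed to always look like Released 2021, September 24, and %B assumes an English locale.

```python
import datetime


def reformat_release(raw):
    '''Turn 'Released 2021, September 24' into two display strings.'''
    text = raw.replace('Released ', '').strip()
    parsed = datetime.datetime.strptime(text, '%Y, %B %d')
    # First the month/year form, then the numeric day.month.year form.
    return parsed.strftime('%B, %Y'), parsed.strftime('%d.%m.%Y')


print(reformat_release('Released 2021, September 24'))
```

In the spider, .get() can return None when the XPath matches nothing, so a guard before calling the helper is probably wise.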
",python
+"Tkinter IntVar attribute error when using radiobuttonsI am making a simple calculator using tkinter, and this error message keeps showing up.
+
+AttributeError: type object 'Tk' has no attribute 'IntVar'
+
+Traceback:
+
+File "c:/Desktop/my programs/Calculator.py", line 9, in
+v = Tk.IntVar()
+AttributeError: type object 'Tk' has no attribute 'IntVar'
+
+here is my code:
+import tkinter as tk
+from tkinter import *
+import time
+
+root = Tk()
+tk = Tk()
+v = Tk.IntVar()
+v.set(1) # initializing the choice
+root.title('Calculator')
+root.geometry('600x400+50+50')
+def calucator():
+ value1 = tk.StringVar(root, Value= 'Number:' )
+ value1 = tk.Entry(root, textvariable=value1).pack()
+ time.sleep(1)
+ value2 = tk.StringVar(root, Value= 'Number:' )
+ value2 = tk.Entry(root, textvariable=value2).pack()
+ time.sleep(1)
+
+ operations = [("Addition", 101),
+ ("Subtraction", 102),
+ ("Divison", 103),
+ ("Mulitplication", 104)]
+
+ def ShowChoice():
+ print(v.get())
+
+ tk.Label(root,
+ text="""Choose your Operation""",
+ justify = tk.LEFT,
+ padx = 20).pack()
+
+ for operations, val in operations:
+ tk.Radiobutton(root,
+ text=operations,
+ padx = 20,
+ variable=v,
+ command=ShowChoice,
+ value=val).pack(anchor=tk.W)
+
+
+if val==101:
+ print('value1 + value2')
+
+elif val==102:
+ print('value1 - value2')
+
+elif val==103:
+ print('value1 * value2')
+
+elif val==104:
+ print('value1 / value2')
+
+else:
+ close
+
+I have looked everywhere and are yet to find a fix. I am aware that there are other threads on stack overflow with similar problems to mine however all the solutions I have tried have not worked.
+Any help would be great!
+mrt
","You've imported a module and named it tk. Then you create a new object and named it tk. You can only have one thing with that name.
+The solution is to choose a different name for the instance of Tk, or choose a different name for your import.
+You need to remove tk=Tk(). Also, v = Tk.IntVar() needs to be v = tk.IntVar().
",python
+"Map column lists to dictionary and create new column with padded stringsGiven this dataframe and word_index dictionary:
+import pandas as pd
+
+df = pd.DataFrame(data={'text_ids': [
+ [1, 2, 3, 2, 7, 2, 8, 2, 0],
+ [1, 2, 4, 2, 7, 2, 8, 2, 0],
+ [1, 2, 5, 2, 6, 2, 8, 2, 0],
+ [1, 2, 9, 2, 6, 2, 10, 2, 11, 2, 8, 0]
+ ]})
+
+word_index = {0: '<eos>', 1: '<sos>', 2: '/s', 3: 'he', 4: 'she', 5:'they', 6:'love', 7:'loves', 8: 'cats', 9: 'we', 10: 'talking', 11: 'about', 12: '<pad>'}
+
+How can I map each sequence in text_ids to its corresponding value(s) in word_index, while making sure that /s really creates spaces in each string? Also, I need to add <pad> tokens to each string that has a length smaller than the largest integer sequence.
+Expected output:
+ text_ids text
+0 [1, 2, 3, 2, 7, 2, 8, 2, 0] <sos> he loves cats <eos><pad><pad><pad>
+1 [1, 2, 4, 2, 7, 2, 8, 2, 0] <sos> she loves cats <eos><pad><pad><pad>
+2 [1, 2, 5, 2, 6, 2, 8, 2, 0] <sos> they love cats <eos><pad><pad><pad>
+3 [1, 2, 9, 2, 6, 2, 10, 2, 11, 2, 8, 0] <sos> we love talking about cats <eos>
+
","You could use map to assign the values from your dictionary. Make sure to first replace '/s' with ' '.
+Then reshape your dataframe to wide format with pivot to ensure the same number of items and fillna the missing spots with "<pad>".
+Finally aggregate to a string with apply and join to the original dataframe:
+word_index[2] = ' '
+
+df2 = df['text_ids'].explode().map(word_index).reset_index()
+
+df.join(
+ df2.assign(col=df2.groupby('index').cumcount())
+ .pivot(index='col', columns='index', values='text_ids')
+ .fillna('<pad>')
+ .apply(''.join)
+ .rename('text')
+)
+
+output:
+ text_ids text
+0 [1, 2, 3, 2, 7, 2, 8, 2, 0] <sos> he loves cats <eos><pad><pad><pad>
+1 [1, 2, 4, 2, 7, 2, 8, 2, 0] <sos> she loves cats <eos><pad><pad><pad>
+2 [1, 2, 5, 2, 6, 2, 8, 2, 0] <sos> they love cats <eos><pad><pad><pad>
+3 [1, 2, 9, 2, 6, 2, 10, 2, 11, 2, 8, 0] <sos> we love talking about cats<eos>
+
+Another option using apply:
+word_index[2] = ' '
+
+# padding values
+l = df['text_ids'].str.len()
+pad = (l.max()-l).mul(pd.Series(['<pad>']*len(l)))
+
+df['text'] = df['text_ids'].apply(lambda s: ''.join(word_index[e] for e in s))+pad
+
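For comparison, the same mapping and padding can be sketched in plain Python without pandas. The helper name decode_and_pad is made up for illustration, and the result could be assigned back with df['text'] = decode_and_pad(df['text_ids'], word_index).

```python
word_index = {0: '<eos>', 1: '<sos>', 2: ' ', 3: 'he', 4: 'she', 5: 'they',
              6: 'love', 7: 'loves', 8: 'cats', 9: 'we', 10: 'talking',
              11: 'about', 12: '<pad>'}  # key 2 already mapped to a space

sequences = [
    [1, 2, 3, 2, 7, 2, 8, 2, 0],
    [1, 2, 9, 2, 6, 2, 10, 2, 11, 2, 8, 0],
]


def decode_and_pad(seqs, vocab, pad_token='<pad>'):
    '''Join each id sequence into a string and pad to the longest sequence.'''
    seqs = list(seqs)
    longest = max(len(seq) for seq in seqs)
    return [
        ''.join(vocab[i] for i in seq) + pad_token * (longest - len(seq))
        for seq in seqs
    ]


texts = decode_and_pad(sequences, word_index)
print(texts)
```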
",python
+"How to convert a dictionary according to a json scheme, Python3I have a json scheme, which specifies the format of a dictionary in Python 3.
+INPUT_SCHEME = {
+ "type": "object",
+ "properties": {
+ "a1": {
+ "type": "object",
+ "properties": {
+ "a1_1": {"type": ["string", "null"]},
+ "a1_2": {"type": ["number", "null"]},
+ },
+ "additionalProperties": False,
+ "minProperties": 2,
+ },
+ "a2": {
+ "type": "array",
+ "items": {"type": ["number", "null"]},
+ },
+ "a3": {
+ "type": ["number", "null"],
+ },
+ "a4": {
+ "type": "object",
+ "properties": {
+ "a4_1": {"type": ["string", "null"]},
+ "a4_2": {
+ "type": "object",
+ "properties": {
+ "a4_2_1": {"type": ["string", "null"]},
+ "a4_2_2": {"type": ["number", "null"]},
+ },
+ "additionalProperties": False,
+ "minProperties": 2,
+ },
+ },
+ "additionalProperties": False,
+ "minProperties": 2,
+ },
+ "a5": {
+ "type": "array",
+ "items": {
+ "type": "object",
+ "properties": {
+ "a5_1": {"type": ["string", "null"]},
+ "a5_2": {"type": ["number", "null"]},
+ },
+ "additionalProperties": False,
+ "minProperties": 2,
+ },
+ },
+ },
+ "additionalProperties": False,
+ "minProperties": 5,
+}
+
+And I want to write a function which can convert an arbitrary input dictionary to the format defined by the INPUT_SCHEME.
+The rules are:
+
+- if the input dict misses a field, then fill the field with None or empty list in the output dict.
+- if the input dict has a key that is not defined in the INPUT_SCHEME, then remove it in the output dict.
+
+For example, suppose I have a_input, where only 'a1' is correct. 'a2', 'a3', and 'a4' are missing. Each element in 'a5' misses one property. And 'a6' is an un-defined field.
+The function I want to write should convert a_input to a_output. And you can use jsonschema.validate to check.
+a_input = {
+ 'a1': {'a1_1': 'apple', 'a1_2': 20.5},
+ 'a5': [{'a5_1': 'pear'}, {'a5_2': 18.5}],
+ 'a6': [1, 2, 3, 4],
+}
+
+a_output = {
+ 'a1': {'a1_1': 'apple', 'a1_2': 20.5},
+ 'a2': [],
+ 'a3': None,
+ 'a4': {
+ 'a4_1': None,
+ 'a4_2': {
+ 'a4_2_1': None,
+ 'a4_2_2': None,
+ }
+ },
+ 'a5': [
+ {
+ 'a5_1': 'pear',
+ 'a5_2': None,
+ },
+ {
+ 'a5_1': None,
+ 'a5_2': 18.5,
+ }
+ ]
+}
+
+jsonschema.validate(a_output, schema=INPUT_SCHEME)
+
+I tried to write the function, but could not make it. Mainly because there are too many if-else check plus the nested structure, and I got lost. Could you please help me?
+Thanks.
+def my_func(a_from):
+ a_to = dict()
+ for key_1 in INPUT_SCHEME['properties'].keys():
+ if key_1 not in a_from:
+ a_to[key_1] = None # This is incorrect, since the structure of a_to[key_1] depends on INPUT_SCHEME.
+ continue
+
+ layer_1 = INPUT_SCHEME['properties'][key_1]
+ if 'properties' in layer_1: # like a1, a4
+ for key_2 in layer_1['properties'].keys():
+ layer_2 = layer_1['properties'][key_2]
+ ...
+
+ # but it can be a nest of layers. Like a4, there are 3 layers. In real case, it can have more layers.
+
+ elif 'items' in layer_1:
+ if 'properties' in layer_1['items']: # like a5
+ ...
+ else: # like a2
+ ...
+ else: # like a3
+ ...
+ return a_to
+
","A recursive algorithm suits this.
+I divided it into 2 different functions as removing undefined properties and filling non-existent ones from the schema are 2 different tasks. You can merge them into one if you wish.
+For filling nonexistent properties, I just create arrays, objects and Nones, and then recurse inwards.
+For removing the undefined properties, I compare the schema keys and remove unmatched keys, again, recursing inwards.
+You may see comments and type checks in code:
+def fill_nonexistent_properties(input_dictionary, schema):
+    """
+    Fill missing properties in input_dictionary according to the schema.
+    """
+    properties = schema['properties']
+    missing_properties = set(properties).difference(input_dictionary)
+
+    # Fill all missing properties.
+    for key in missing_properties:
+        value = properties[key]
+        if value['type'] == 'array':
+            input_dictionary[key] = []
+        elif value['type'] == 'object':
+            input_dictionary[key] = {}
+        else:
+            input_dictionary[key] = None
+
+    # Recurse inside all properties.
+    for key, value in properties.items():
+
+        # If it's an array of objects, recurse inside each item.
+        if value['type'] == 'array' and value['items']['type'] == 'object':
+            object_list = input_dictionary[key]
+
+            if not isinstance(object_list, list):
+                raise ValueError(
+                    f"Invalid JSON object: {key} is not a list.")
+
+            for item in object_list:
+                if not isinstance(item, dict):
+                    raise ValueError(
+                        f"Invalid JSON object: {key} is not a list of objects.")
+                fill_nonexistent_properties(item, value['items'])
+
+        # If it's an object, recurse inside it.
+        elif value['type'] == 'object':
+            obj = input_dictionary[key]
+            if not isinstance(obj, dict):
+                raise ValueError(
+                    f"Invalid JSON object: {key} is not a dictionary.")
+            fill_nonexistent_properties(obj, value)
+
+def remove_undefined_properties(input_dictionary, schema):
+    """
+    Remove properties in input_dictionary that are not defined in the schema.
+    """
+    properties = schema['properties']
+    undefined_properties = set(input_dictionary).difference(properties)
+
+    # Remove all undefined properties.
+    for key in undefined_properties:
+        del input_dictionary[key]
+
+    # Recurse inside all existing properties.
+    for key, value in input_dictionary.items():
+        property_schema = properties[key]
+
+        # If it's an array of objects, recurse inside each item.
+        if isinstance(value, list):
+            if not property_schema['type'] == 'array':
+                raise ValueError(
+                    f"Invalid JSON object: {key} is not a list.")
+
+            # We're only dealing with objects inside arrays.
+            if not property_schema['items']['type'] == 'object':
+                continue
+
+            for item in value:
+                # Make sure each item is an object.
+                if not isinstance(item, dict):
+                    raise ValueError(
+                        f"Invalid JSON object: {key} is not a list of objects.")
+                remove_undefined_properties(item, property_schema['items'])
+
+        # If it's an object, recurse inside it.
+        elif isinstance(value, dict):
+            # Make sure the object is supposed to be an object.
+            if not property_schema['type'] == 'object':
+                raise ValueError(
+                    f"Invalid JSON object: {key} is not an object.")
+
+            remove_undefined_properties(value, property_schema)
+
+
+import pprint
+pprint.pprint(a_input)
+fill_nonexistent_properties(a_input, INPUT_SCHEME)
+remove_undefined_properties(a_input, INPUT_SCHEME)
+print("-"*10, "OUTPUT", "-"*10)
+pprint.pprint(a_input)
+
+Output:
+{'a1': {'a1_1': 'apple', 'a1_2': 20.5},
+ 'a5': [{'a5_1': 'pear'}, {'a5_2': 18.5}],
+ 'a6': [1, 2, 3, 4]}
+---------- OUTPUT ----------
+{'a1': {'a1_1': 'apple', 'a1_2': 20.5},
+ 'a2': [],
+ 'a3': None,
+ 'a4': {'a4_1': None, 'a4_2': {'a4_2_1': None, 'a4_2_2': None}},
+ 'a5': [{'a5_1': 'pear', 'a5_2': None}, {'a5_1': None, 'a5_2': 18.5}]}
+
",python
+"Stop auto rounding in numpyI have this code but in the result, the values are converted to integers. I want the real value. How? Thank you for your help.
+
+import numpy as np
+
+a=np.array([1,2,3,4])
+
+f=np.array([0.00020,0.0001])
+
+for i in range(2):
+ a[i]=f[i]
+print(a)
+# [0 0 3 4]
+
","a is an array of integers, you have to set it to a float array
+a=np.array([1,2,3,4], dtype=float)
+
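A short sketch of why this happens: assigning into an existing NumPy array casts the new values to the array's dtype, so small floats written into an integer array are truncated to 0. The variable names a_int and a_float are made up for the comparison.

```python
import numpy as np

f = np.array([0.00020, 0.0001])

a_int = np.array([1, 2, 3, 4])                  # integer dtype inferred
a_int[:2] = f                                   # floats truncated on assignment

a_float = np.array([1, 2, 3, 4], dtype=float)   # float64 from the start
a_float[:2] = f                                 # values kept as given

print(a_int.dtype, a_int)
print(a_float.dtype, a_float)
```

An existing integer array can also be converted with a.astype(float), which returns a new array.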
",python
+"Pyinstaller --hidden-imports not workingI'm trying to build an exe file from a python game using PyBox2D and Pyglet.
+When I build the exe there occurs an error that a module cannot be imported:
+Unable to import the back-end pyglet: module 'gui.backends' has no attribute 'pyglet_framework'
+I guess this happens because this file only gets imported indirectly/hidden from another file with __import__()
+This is my project hierarchy:
+![]()
+I've tried to add the file to pyinstaller on multiple ways. From the directory /SpaceJam/building I've tried calling:
+pyinstaller.exe --onefile "../game/spacejam.py" --hidden-import="gui/framework/backends/pyglet_framework.py"
+pyinstaller.exe --onefile "../game/spacejam.py" --hidden-import="../gui/framework/backends/pyglet_framework.py"
+pyinstaller.exe --onefile "../game/spacejam.py" --hidden-import="../gui/framework/backends/pyglet_framework"
+pyinstaller.exe --onefile "../game/spacejam.py" --hidden-import="../gui/framework/backends/*"
+
+but none of that seemed to change anything in the error message.
+I feel like I'm missing something obvious. Does somebody have an idea what I might be doing wrong or why the --hidden-import argument doesn't seem to work?
","That argument expects a module name, not a filesystem path. How would you import it?
+I'm not totally clear on how your project is set up, but try
+--hidden-import gui.backends.pyglet_framework
+
",python
+"How to extract text from multiple pdf in a location with specific line and store in Excel?I have 100 pdf stored in a location and I want to extract text from them and store in excel
+below is pdf image
+in this i want (stored in page1)
+bid no,end date,item category,organisation name
+
+
+![]()
+needed
+OEM Average Turnover (Last 3 Years),Years of Past Experience required,MSE Exemption for Years Of Experience
+and Turnover,Startup Exemption for Years of Experience
+and Turnover,Estimated Bid Value,EMD Required
+
+![]()
+Consignee address only)
+![]()
","Tika is one of the Python packages that you can use to extract the data from your PDF files.
+In the example below I'm using Tika and regular expressions to extract these five data elements:
+
+- bid no
+- end date
+- item category
+- organisation name
+- total quantity
+
+import re as regex
+from tika import parser
+
+parse_entire_pdf = parser.from_file('2022251527199.pdf', xmlContent=True)
+for key, values in parse_entire_pdf.items():
+    if key == 'content':
+        bid_number = regex.search(r'(Bid Number:)\W(GEM\W\d{4}\W[A-Z]\W\d+)', values)
+        print(bid_number.group(2))
+        # GEM/2022/B/1916455
+
+        bid_end_date = regex.search(r'(Bid End Date\WTime)\W(\d{2}-\d{2}-\d{4}\W\d{2}:\d{2}:\d{2})', values)
+        print(bid_end_date.group(2))
+        # 21-02-2022 15:00:00
+
+        org_name = regex.search(r'(Organisation Name)\W(.*)', values)
+        print(org_name.group(2))
+        # State Election Commission (sec), Gujarat
+
+        item_category = regex.search(r'(Item Category)\W(.*)', values)
+        print(item_category.group(2))
+        # Desktop Computers (Q2) , Computer Printers (Q2)
+
+        total_quantity = regex.search(r'(Total Quantity)\W(\d+)', values)
+        print(total_quantity.group(2))
+        # 18
+
+
+Here is one way to write out the extracted data to a CSV file:
+import csv
+import re as regex
+from tika import parser
+
+document_elements = []
+
+# processing 2 documents
+documents = ['202225114747453.pdf', '2022251527199.pdf']
+for doc in documents:
+ parse_entire_pdf = parser.from_file(doc, xmlContent=True)
+ for key, values in parse_entire_pdf.items():
+ if key == 'content':
+ bid_number = regex.search(r'(Bid Number:)\W(GEM\W\d{4}\W[A-Z]\W\d+)', values)
+
+ bid_end_date = regex.search(r'(Bid End Date\WTime)\W(\d{2}-\d{2}-\d{4}\W\d{2}:\d{2}:\d{2})', values)
+
+ org_name = regex.search(r'(Organisation Name)\W(.*)', values)
+
+ item_category = regex.search(r'(Item Category)\W(.*)', values)
+
+ total_quantity = regex.search(r'(Total Quantity)\W(\d+)', values)
+
+ document_elements.append([bid_number.group(2),
+ bid_end_date.group(2),
+ org_name.group(2),
+ item_category.group(2),
+ total_quantity.group(2)])
+
+
+with open("out.csv", "w", newline="") as f:
+ headerList = ['bid_number', 'bid_end_date', 'org_name', 'item_category', 'total_quantity']
+ writer = csv.writer(f)
+ writer.writerow(headerList)
+ writer.writerows(document_elements)
+
+Here is the additional code that you asked for in the comments.
+import csv
+import os
+import re as regex
+from tika import parser
+
+document_elements = []
+
+image_directory = "pdf_files"
+image_directory_abspath = os.path.abspath(image_directory)
+for dirpath, dirnames, filenames in os.walk(image_directory_abspath):
+ for filename in [f for f in filenames if f.endswith(".pdf")]:
+ parse_entire_pdf = parser.from_file(os.path.join(dirpath, filename), xmlContent=True)
+ for key, values in parse_entire_pdf.items():
+ if key == 'content':
+ bid_number = regex.search(r'(Bid Number:)\W(GEM\W\d{4}\W[A-Z]\W\d+)', values)
+
+ bid_end_date = regex.search(r'(Bid End Date\WTime)\W(\d{2}-\d{2}-\d{4}\W\d{2}:\d{2}:\d{2})', values)
+
+ org_name = regex.search(r'(Organisation Name)\W(.*)', values)
+
+ item_category = regex.search(r'(Item Category)\W(.*)', values)
+
+ total_quantity = regex.search(r'(Total Quantity)\W(\d+)', values)
+
+ document_elements.append([bid_number.group(2),
+ bid_end_date.group(2),
+ org_name.group(2),
+ item_category.group(2),
+ total_quantity.group(2)])
+
+with open("out.csv", "w", newline="") as f:
+ headerList = ['bid_number', 'bid_end_date', 'org_name', 'item_category', 'total_quantity']
+ writer = csv.writer(f)
+ writer.writerow(headerList)
+ writer.writerows(document_elements)
+
+
+SPECIAL NOTE: I noted that some PDFs don't have an org_name, so you will have to figure out how to handle these with either a N/A, None, or Null
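One possible way to handle the missing fields from the note above is a small wrapper that returns a default when the pattern is not found, instead of calling .group() on None. The helper name first_group and the sample text are made up for illustration. The patterns are the ones used in the code above.

```python
import re


def first_group(pattern, text, group=2, default='N/A'):
    '''Return the requested regex group, or default when there is no match.'''
    match = re.search(pattern, text)
    return match.group(group) if match else default


# Made-up stand-in for the text Tika returns; it has no organisation name.
sample_text = 'Bid Number: GEM/2022/B/1916455\nTotal Quantity 18'

bid_number = first_group(r'(Bid Number:)\W(GEM\W\d{4}\W[A-Z]\W\d+)', sample_text)
org_name = first_group(r'(Organisation Name)\W(.*)', sample_text)
total_quantity = first_group(r'(Total Quantity)\W(\d+)', sample_text)
print([bid_number, org_name, total_quantity])
```

Each regex.search(...).group(2) pair in the loops above could then be swapped for one first_group call.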
",python
+"How to pull comments from individual Youtube videos instead of random videosI'm building a project using the Youtube API where I want to pull comments from each video in a specific youtuber's channel. However, when I pull the comments, it only pulls one comment from each video instead of pulling 100 comments from each video like I want it to.
+def get_video_comments(video_id):
+ url_video_comments ='https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&videoId='+video_id+'&maxResults=100&key='+API_KEY
+ response_video_comments = requests.get(url_video_comments).json()
+
+ for comment in response_video_comments['items']:
+ if comment['kind'] == 'youtube#commentThread':
+ comment_text = comment['snippet']['topLevelComment']['snippet']['textOriginal']
+ comment_likes = comment['snippet']['topLevelComment']['snippet']['likeCount']
+
+ return comment_text, comment_likes
+
+def get_videos(df,df_comments,pageToken):
+ url = 'https://www.googleapis.com/youtube/v3/search?key='+API_KEY+'&channelId='+channel_id+'&pageToken='+pageToken+'&part=snippet,id&order=date&maxResults=10000'
+
+ response = requests.get(url).json()
+ # wait 1 second before executing for loop in order to get all of the data
+ time.sleep(1)
+ # create loop to get information on every video
+ for video in response['items']:
+ # makes sure that we only get information on videos
+ if video['id']['kind'] == 'youtube#video':
+ video_id = video['id']['videoId']
+ video_title = video['snippet']['title']
+ upload_date = video['snippet']['publishedAt']
+ upload_date = str(upload_date).split('T')[0]
+
+ view_count, like_count, comment_count = get_video_details(video_id)
+ comment_text, comment_likes = get_video_comments(video_id)
+ # save data in pandas DF
+ df = df.append({'video_id':video_id,'video_title':video_title, 'upload_date':upload_date, 'view_count':view_count,
+ 'like_count':like_count,'comment_count':comment_count}
+ , ignore_index=True)
+ df_comments = df_comments.append({'video_id':video_id, 'comment_text':comment_text,
+ 'comment_likes':comment_likes},ignore_index=True)
+ return df, df_comments
+
+Below is the output.
+ video_id comment_text comment_likes
+0 0faCad2kKeg Basically, this video describes how you are ab... 0
+1 9dnN82DsQ2k thanks for using metric 0
+2 Y413Czri6qw "Another way to fight climate change is to eat... 0
+3 W3qZIPiWKc4 6:00 0
+4 ggUduBmvQ_4 should be illegal to pull loans for speculation 0
+5 xhxo2oXRiio Man, this is really getting to me :( 0
+6 8d5d_HXGeMA Not sure how you can fail to bring up powerful... 0
+7 8egszLpKMWU africa is not the next anything. 0
+8 WNrobOYWZQE For space exploration to happen, war needs to ... 0
+9 ZZ3F3zWiEmc Taxes themselves are Legalized theft. Moreover... 0
+10 V16GdzRvhRU Saudi Arabia and the other Islamic oil kingdom... 0
+11 o4tuhWvKduU This video makes getting out of Afganistan see... 0
+12 1-uNMj57Y4c Meanwhile, I am not in the air travel market a... 0
+13 SR7BA3xEmDo rather trains...... 0
+14 iO5mfbpq16A Because the Southern hemisphere is mostly wate... 0
+15 B3FKtBNEBRc The annoying playground cytomorphologically sm... 0
+16 VJtFgte1GKc 10:36 the ruthless pursuit of a fish in sea 0
+17 J5PLyYVIEpg "Indian nation" Indians are in India 1
+18 LHhJuAOK3CI This is really great, but I hope you do more p... 0
+19 aH4b3sAs-l8 What would happen if lightning hit it mid flight? 0
+20 b1JlYZQG3lI You actually can't be more wrong. We shut down... 1
+21 BNpk_OGEGlA We control the supply chain by how involved we... 0
+22 4p0fRlCHYyg Hey did you see the “new” united supersonic pl... 0
+23 N4dOCfWlgBw I hope the cruise industry dies 0
+24 3CuPqeIJr3U You westerners your days are up..Just get lost 0
+25 DlTq8DbRs4k Maggie's dead now :) 0
+26 VjiH3mpxyrQ What you’ve missed about ‘Crisis’ stress testi... 0
+27 pLcqJ2DclEg If I were to buy a Tesla Today, I would also h... 0
+28 3gdCH1XUIlE I’m glad I live in a civilized country where t... 0
+29 2qanMpnYsjk I just had this recommended to me and I can al... 0
+30 7R7jNWHp0D0 Another unequal biased Biden loving Trump hat... 0
+31 GIFV_Z7Y9_w Hello from Kazakhstan 0
+32 KXRtNwUju5g This is sexist, why was it women that throws t... 0
+33 fTyUE162lrw What's the point of these subscription based l... 0
+34 ZAEydOjNWyQ So if Covid is stoping it, just get everybody ... 0
+35 _BCY0SPOFpE Great observation 0
+36 v_rXhuaI0W8 13:50 Such a lie. 1) The two have nothing to d... 0
+37 Ongqf93rAcM This works but it’s still gambling so don’t ex... 0
+38 byW1GExQB84 Can we have an update?\nAstra Zeneca seems rea... 1
+39 DTIDCA7mjZs My dad be like: Gracias por las instrucciones ... 1
+40 H_akzwzghWQ Why are you showing us graphs of "percent devi... 0
+41 7C1fPocIFgU There is a (partial) solution in the middle. C... 0
+42 3J06af5xHD0 Her speech bought me to tears. 0
+43 YgiMqePRp0Y There is a problem with pooled testing. The th... 0
+44 Rtmhv5qEBg0 Great work good game 0
+45 6GMoUmvw8kU 9:48 yo, wtf is wrong with that man`s face? 0
+46 QlPrAKtegFQ Follow along as a west bound autonomous 80,000... 0
+47 uAG4zCsiA_w development of tiltrotor aircraft has almost s... 1
+48 NtX-Ibi21tU Wait there is a risk of 1:55 000 by walking ou... 0
+49 r2oPk20OHBE 2:41 The street where I live was the last thin... 0
+
+The video_id matches the video that I want to pull the comments from. I'm just struggling to understand how I can pull more than one comment. Any help would be greatly appreciated.
","Your get_video_comments function is well designed, with the loop for comment in response_video_comments['items']:. However, at each iteration you overwrite the previous comment stored in comment_text. The function should instead return a list of comment_text and comment_likes values, and at each iteration you should append the current comment to that list.
+The algorithm to grab all comments from a video is very common, so search further if you are still stuck.
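A minimal sketch of the accumulate-and-return pattern described above. The response dictionary is a hand-made stand-in, and the nested field names (snippet, topLevelComment, textDisplay, likeCount) are assumptions about the shape of the API response, since the original function is not shown in full:

```python
def get_video_comments(response_video_comments):
    '''Collect every comment in the response instead of keeping only the last one.'''
    comments = []
    for comment in response_video_comments['items']:
        # Assumed nesting of the comment payload; adjust to the real response.
        snippet = comment['snippet']['topLevelComment']['snippet']
        # Append inside the loop so earlier comments are not overwritten.
        comments.append({
            'comment_text': snippet['textDisplay'],
            'comment_likes': snippet['likeCount'],
        })
    return comments


# Hypothetical response with two comments, for illustration only.
fake_response = {
    'items': [
        {'snippet': {'topLevelComment': {'snippet': {'textDisplay': 'Great observation', 'likeCount': 0}}}},
        {'snippet': {'topLevelComment': {'snippet': {'textDisplay': 'Hello from Kazakhstan', 'likeCount': 0}}}},
    ]
}
all_comments = get_video_comments(fake_response)
print(len(all_comments))
```

The returned list can then be turned into rows of a dataframe, one row per comment.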
",python
+"Gstreamer how to get element information from STATE_CHANGED message from the busI am working with Gstreamer and its Python bindings. Consider the following bus_call function:
+import sys
+from gi.repository import Gst
+
+
+def bus_call(bus, message, loop):
+ t = message.type
+ if t == Gst.MessageType.EOS:
+ print("Bus call: End-of-stream\n")
+ # loop.quit()
+ elif t == Gst.MessageType.WARNING:
+ err, debug = message.parse_warning()
+ sys.stderr.write("Bus call: Warning: %s: %s\n" % (err, debug))
+ elif t == Gst.MessageType.ERROR:
+ err, debug = message.parse_error()
+ sys.stderr.write("Bus call: Error: %s: %s\n" % (err, debug))
+ # loop.quit()
+ elif t == Gst.MessageType.BUFFERING:
+ print("Bus call: Buffering\n")
+ elif t == Gst.MessageType.STATE_CHANGED:
+ old_state, new_state, pending_state = message.parse_state_changed()
+ print((
+ f"Bus call: Pipeline state changed from {old_state.value_nick} to {new_state.value_nick} "
+ f"(pending {pending_state.value_nick})"
+ ))
+ else:
+ print(f"Bus call: {message}\n")
+ return True
+
+When the message type is Gst.MessageType.STATE_CHANGED, how can I retrieve the element that has changed the state?
","Not sure I correctly understand, but you may try something like:
+old_state, new_state, pending_state = message.parse_state_changed()
+print('%s : State changed from %s to %s' % (message.src.get_name(), old_state, new_state))
+
",python
+"Multiply two Dataframes if the column names and the condition on a column value matchesI have two dataframes(namely a and b):
+a =
+![]()
+and b =
+![]()
+And I am looking for a way to multiply the values in each column of Dataframe a with the corresponding column value of Dataframe b if the dates are matching. For example:values in column aa_10 of Dataframe a for the date 2021-01-19 with the corresponding value in column aa_10 of Dataframe b for the date 2021_01_19 (i.e., the column- NaN, 2.0, 3.0,4.0 of Dataframe a to be multiplied with the value 4 in Dataframe b) and so on.
+Desired result:
+![]()
+Sample Data:
+import numpy as np
+import pandas as pd
+
+d = {'Date' : pd.Series([20210119, 20210119, 20210119, 20210119, 20210122, 20210122, 20210122, 20210122]),
+ 'To' : pd.Series(['aa', 'bb', 'cc', 'dd', 'aa', 'bb', 'cc', 'dd']),
+ 'aa_10' : pd.Series([np.nan, 2, 3, 4, np.nan, 4, 3, 2]),
+ 'bb_11' : pd.Series([6, np.nan, 8, 9, 9, np.nan, 7, 8]),
+ 'cc_12' : pd.Series([1, 2, np.nan, 4, 4, 3, np.nan, 5]),
+ 'dd_13' : pd.Series([6, 7, 8, np.nan, 8, 6, 9, np.nan])}
+
+# creates Dataframe.
+a = pd.DataFrame(d)
+a['Date'] = pd.to_datetime(a['Date'], format='%Y%m%d')
+# print the data.
+display (a)
+
+# Initialize data to Dicts of series.
+d = {'Date' : pd.Series([20190110, 20210119, 20210121, 20210122]),
+ 'aa_10' : pd.Series([2, 4, 1, 2]),
+ 'bb_11' : pd.Series([1, 3, 5, 4]),
+ 'cc_12' : pd.Series([10, 12, 4, 2]),
+ 'dd_13' : pd.Series([2, 1, 2, 5])}
+
+# creates Dataframe.
+b = pd.DataFrame(d)
+b['Date'] = pd.to_datetime(b['Date'], format='%Y%m%d')
+# print the data.
+display(b)
+
","Set Date (together with To in a) as the index, then multiply:
+s=a.set_index(['Date', 'To']).mul(b.set_index(['Date'])).reset_index()
+
+
+
+
+ Date To aa_10 bb_11 cc_12 dd_13
+0 2021-01-19 aa NaN 18.0 12.0 6.0
+1 2021-01-19 bb 8.0 NaN 24.0 7.0
+2 2021-01-19 cc 12.0 24.0 NaN 8.0
+3 2021-01-19 dd 16.0 27.0 48.0 NaN
+4 2021-01-22 aa NaN 36.0 8.0 40.0
+5 2021-01-22 bb 8.0 NaN 6.0 30.0
+6 2021-01-22 cc 6.0 28.0 NaN 45.0
+7 2021-01-22 dd 4.0 32.0 10.0 NaN
+
",python
+"iterating through pandas dataframe and create new columns using custom functionI have a pandas dataframe which (obviously) contains some data. I have created a function that outputs a number of new columns. How can I iterate or apply that function?
+I have created a minimal example below (not the actual problem), with a dataframe and function.
+EDIT: Think of the function as a "black box". We don't know what is in it, but based on the input it returns a dataframe that should be added to the existing dataframe.
+import pandas as pd
+a=pd.DataFrame({"input1": ["a","b"], "input2":[3,2]})
+
+ input1 input2
+0 a 3
+1 b 2
+
+def f(i1,i2):
+ return(pd.DataFrame([{"repeat" : [i1]*i2, "square":i2**2 }]))
+
+
+So in this case the function returns two new columns "repeat" and "square"
+f(a.iloc[0,0],a.iloc[0,1])
+
+ repeat square
+0 [a, a, a] 9
+
+f(a.iloc[1,0],a.iloc[1,1])
+ repeat square
+0 [b, b] 4
+
+What I would like to end up with is a data frame like this
+ input1 input2 repeat square
+0 a 3 [a, a, a] 9
+1 b 2 [b, b] 4
+
+Does anyone have an elegant solution to this?
","How about using pd.concat?
+generated_df = pd.concat([f(*args) for args in a.to_numpy()], ignore_index=True)
+out = pd.concat([a, generated_df], axis=1)
+
+Output:
+>>> out
+ input1 input2 repeat square
+0 a 3 [a, a, a] 9
+1 b 2 [b, b] 4
+
",python
+"How to submit data using HTML in the frontend and get the results from FastAPI backend?I am using FastAPI for the backend and HTML/CSS for the frontend. I want to click the button and get the wanted value in return (mt is the prediction that the JSON array holds for the node1 and node2 I enter).
+This is my code:
+This is the data list
+prediction_data = [
+ { "node1": 0, "node2": 1, "pred": 0},
+ { "node1": 0, "node2": 476, "pred":0.352956 },
+ { "node1": 0, "node2": 494, "pred":0.769988 },
+ { "node1": 1, "node2": 505, "pred":0.463901 },
+ { "node1": 9, "node2": 68 , "pred":1.238807},
+ { "node1": 15, "node2": 408, "pred":0.204171 },
+ { "node1": 18, "node2":549 , "pred":0.204171 },
+ { "node1": 60, "node2": 227, "pred":0.204171 },
+ { "node1": 199, "node2": 220, "pred":0.245246 },
+ { "node1": 170, "node2": 570, "pred":0.509272 },
+ { "node1": 148, "node2": 570, "pred":0.204171 },
+ { "node1": 151, "node2": 384, "pred":0.204114 },
+ { "node1": 232, "node2": 337, "pred":0.285999 },
+ { "node1": 446, "node2": 509, "pred":0.291206 },
+ { "node1": 510, "node2":576 , "pred":0.495378 },
+ { "node1": 571, "node2":589 , "pred":0 },
+ { "node1": 585, "node2":596 , "pred":0.245243 },
+ { "node1": 446, "node2":509 , "pred":0.291206 },
+ { "node1": 375, "node2":383 , "pred":0.46390 },
+ { "node1": 461, "node2":462 , "pred":0 }
+ ]
+
+This the getter function of the wanted value
+# Prediction
+@app.get("/prediction/{node1,node2}", response_class=HTMLResponse)
+async def gets(request: Request, node1: int, node2: int):
+ matching = list(filter(lambda x: x['node1'] == node1 and x['node2'] == node2, prediction_data))
+ mt = matching[0]['pred'] if matching else None
+ return templates.TemplateResponse("Interface.html", {"request": request, "mt": mt})
+
+This is the interface
+
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <meta charset="UTF-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+ <title>Link Prediction</title>
+ <link href="{{ url_for('static', path='/style.css') }}" rel="stylesheet">
+</head>
+
+<body>
+ <div class="container" >
+ <h1>Link Prediction </h1>
+ <h2>In Social Network</h2>
+ <form class="form" action="#">
+ <fieldset class="form-fieldset ui-input __first">
+ <input type="number" id="Node1" tabindex="0" /> {{node1}}
+ <label for="Node1">
+ <span data-text="Node 1">Node 1</span>
+ </label>
+ </fieldset>
+
+ <fieldset class="form-fieldset ui-input __second">
+ <input type="number" id="Node2" tabindex="0" /> {{node2}}
+ <label for="Node2">
+ <span data-text="Node 2">Node 2 </span>
+ </label>
+ </fieldset>
+
+ <div class="form-footer">
+ <button onclick="myfunctionName(Node1, Node2)" class="btn">Predict Now</button>
+ </div>
+
+ <script type="text/javascript">
+
+ function myfunctionName( n1,n2 ){
+
+ document.getElementById("Node1").innerHTML += n1;
+ document.getElementById("Node2").innerHTML += n2;
+ document.getElementById("Prediction") = mt;
+ }
+
+ </script>
+
+<fieldset class="form-fieldset ui-input __third">
+ <input type="text" id="Prediction" readonly/> {{mt}}
+ <label for="Prediction">
+ <span data-text="Prediction Result" >Prediction Result</span>
+ </label>
+ </fieldset>
+
+ </form>
+ </div>
+</body>
+</html>
+
+
+
","To start with, you should have a look at how Path parameters work in FastAPI. The following in your code @app.get("/prediction/{node1,node2}" is not a valid route for passing path parameters. It will instead be recognised as a single path name. You can verify this by opening Swagger UI
+at http://127.0.0.1:8000/docs, which shows that the endpoint expects node1 and node2 to be query params, not path params. When submitting form data, one should use Form for that purpose, and POST HTTP method (your endpoint should be using @app.post()). Please take the time to read the documentation, as well as have a look at this answer. If you still, however, need this to be a GET request, you can use a GET operation @app.get() in your API, as well as define the method for your HTML form being method="GET". Below is an example demonstrating this.
+In your API (access the main page at http://127.0.0.1:8000):
+@app.get("/", response_class=HTMLResponse)
+def main(request: Request):
+ return templates.TemplateResponse("predict.html", {"request": request})
+
+@app.get("/predict", response_class=HTMLResponse)
+def predict(request: Request, node1: int, node2: int):
+ matching = list(filter(lambda x: x['node1'] == node1 and x['node2'] == node2, prediction_data))
+ mt = matching[0]['pred'] if matching else None
+ return templates.TemplateResponse("predict.html", {"request": request, "node1": node1, "node2": node2, "mt": mt})
+
+In your template (i.e., predict.html in the case above)
+<!DOCTYPE html>
+<html lang="en">
+ <body>
+ <h1>Link Prediction </h1>
+ <h2>In Social Network</h2>
+ <form method="GET" action="/predict" id="myForm">
+ node1 : <input type="text" name="node1" value={{node1}}> <br>
+ node2 : <input type="text" name="node2" value={{node2}}><br>
+ <input type="submit" value="Predict" >
+ </form>
+ Prediction Result: <b>{{mt}}</b>
+ </body>
+</html>
+
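To illustrate what the GET form above sends, here is a small standard-library sketch. It does not use FastAPI itself: the lookup helper and the two-row prediction_data are illustrative assumptions, and the parsing step only mimics what the framework does when it fills node1 and node2 from the query string.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Two rows copied from the question's data, for illustration.
prediction_data = [
    {'node1': 0, 'node2': 476, 'pred': 0.352956},
    {'node1': 9, 'node2': 68, 'pred': 1.238807},
]


def lookup(node1, node2):
    '''Return the prediction for a node pair, or None when there is no match.'''
    matching = [x for x in prediction_data
                if x['node1'] == node1 and x['node2'] == node2]
    return matching[0]['pred'] if matching else None


# A form with method=GET and inputs named node1/node2 produces a URL like this.
url = '/predict?' + urlencode({'node1': 9, 'node2': 68})

# The server side reads the values from the query string and converts them to int.
params = parse_qs(urlsplit(url).query)
node1 = int(params['node1'][0])
node2 = int(params['node2'][0])
mt = lookup(node1, node2)
print(url, mt)
```

The key point is that the input elements need name attributes; inputs that only have an id are not included in the submitted query string.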
",python
+"python - calculate a new column in pandas using 2 numpy arraysI have a pandas dataframe :
+column header is called "Location"
+example contents:
+"London Arndale Centre"
+"Manchester Arndale"
+"Birmingham Central Station"
+"Newcastle Metro Centre"
+2 numpy arrays :
+originalLocation = np.array(["London Arndale Centre","Manchester Arndale","Birmingham Central Station","Newcastle Metro Centre")
+
+newLocation = np.array(["London","Manchester","Birmingham","Newcastle"]
+
+i want to create a new column in the pandas : newLocation
+the result needs to be the matching column in newLocation, where the location field matches the original location numpy.
+example : "London Arndale Centre" needs to be "London"
+"Manchester Arndale" needs to be "Manchester"
+i have tried this , but it throw back errors
+df['newLocation'] = newLocation[int(np.where(originalLocation == df['Location'])[0])]
+
+errors : ValueError: ('Lengths must match to compare', (159,), (12,))
+what am i doing wrong here ?
","It seems like you forgot the commas in your originalLocation array. Also, the int() is not necessary. Updated code:
+df_data = ["London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre"]
+df = pd.DataFrame(df_data, columns=['Location'])
+
+originalLocation = np.array(["London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre"])
+
+newLocation = np.array(["London","Manchester","Birmingham","Newcastle"])
+
+df['newLocation'] = newLocation[np.where(originalLocation == df['Location'])[0]]
+
+df
+
+Output:
+ Location newLocation
+0 London Arndale Centre London
+1 Manchester Arndale Manchester
+2 Birmingham Central Station Birmingham
+3 Newcastle Metro Centre Newcastle
+
+EDIT: As you mentioned, merge works even if not all values are included in the new locations. I created a small example using merge:
+df_data = ["London Arndale Centre", "Manchester Arndale", "Birmingham Central Station", "Newcastle Metro Centre"]
+df = pd.DataFrame(df_data, columns=['Location'])
+
+originalLocation = ["London Arndale Centre", "Birmingham Central Station", "Newcastle Metro Centre"]
+newLocation = ["London", "Birmingham", "Newcastle"]
+
+df_new = pd.DataFrame({'Location': originalLocation,
+ 'newLocation': newLocation})
+
+df.merge(df_new, on='Location', how='left')
+
+Output with Manchester entry missing:
+Location newLocation
+0 London Arndale Centre London
+1 Manchester Arndale NaN
+2 Birmingham Central Station Birmingham
+3 Newcastle Metro Centre Newcastle
+
",python
+"get count of all positive values at each row - pandasI have dataset with values above and below 0 and I want to calculate count of +ve and -ve with sum and count of values above a threshold added as new column. This dataset has 60 columns.
+Dataset
+A B C D E
+0 foo 1.2 -1 2
+1 bar 1.3 -2 -4
+2 baz 2.1 2 5
+
+The desired result,
+A B C D E positive_count negative_count sum_pos sum_neg above_2
+0 foo 1.2 -1 2 2 1 3.2 -1 1
+1 bar 1.3 -2 -4 1 2 1.3 -6 0
+2 baz 2.1 2 5 3 0 9.1 0 3
+
+I have tried [1] 2 but these add it for columns and not full row. Any ideas would be appreciated!
","You need to use axis to apply the operation row-wise:
+df['positive_count'] = df[df>0].count(axis=1)
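The same row-wise idea extends to the other requested columns. A sketch, assuming the numeric values live in columns C, D and E (as in the sample) and that above_2 counts values greater than or equal to 2, which is what the desired output suggests:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [0, 1, 2],
    'B': ['foo', 'bar', 'baz'],
    'C': [1.2, 1.3, 2.1],
    'D': [-1, -2, 2],
    'E': [2, -4, 5],
})

# Restrict the calculations to the value columns (assumed names).
value_cols = ['C', 'D', 'E']
vals = df[value_cols]

# Boolean masks summed along axis=1 give per-row counts.
df['positive_count'] = (vals > 0).sum(axis=1)
df['negative_count'] = (vals < 0).sum(axis=1)

# where() keeps matching values and turns the rest into NaN, which sum() skips.
df['sum_pos'] = vals.where(vals > 0).sum(axis=1)
df['sum_neg'] = vals.where(vals < 0).sum(axis=1)

# Threshold count; >= is an assumption based on the expected output.
df['above_2'] = (vals >= 2).sum(axis=1)
print(df)
```

With 60 columns, value_cols can be built from the column list instead of being typed out.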
",python
+"OnnxRuntime vs OnnxRuntime+OpenVinoEP inference time differenceI'm trying to accelerate my model's performance by converting it to OnnxRuntime. However, I'm getting weird results, when trying to measure inference time.
+While running only 1 iteration OnnxRuntime's CPUExecutionProvider greatly outperforms OpenVINOExecutionProvider:
+
+- CPUExecutionProvider - 0.72 seconds
+- OpenVINOExecutionProvider - 4.47 seconds
+
+But if I run let's say 5 iterations the result is different:
+
+- CPUExecutionProvider - 3.83 seconds
+- OpenVINOExecutionProvider - 14.13 seconds
+
+And if I run 100 iterations, the result is drastically different:
+
+- CPUExecutionProvider - 74.19 seconds
+- OpenVINOExecutionProvider - 46.96seconds
+
+It seems to me, that the inference time of OpenVinoEP is not linear, but I don't understand why.
+So my questions are:
+
+- Why does OpenVINOExecutionProvider behave this way?
+- What ExecutionProvider should I use?
+
+The code is very basic:
+import onnxruntime as rt
+import numpy as np
+import time
+from tqdm import tqdm
+
+limit = 5
+# MODEL
+device = 'CPU_FP32'
+model_file_path = 'road.onnx'
+
+image = np.random.rand(1, 3, 512, 512).astype(np.float32)
+
+# OnnxRuntime
+sess = rt.InferenceSession(model_file_path, providers=['CPUExecutionProvider'], provider_options=[{'device_type' : device}])
+input_name = sess.get_inputs()[0].name
+
+start = time.time()
+for i in tqdm(range(limit)):
+ out = sess.run(None, {input_name: image})
+end = time.time()
+inference_time = end - start
+print(inference_time)
+
+# OnnxRuntime + OpenVinoEP
+sess = rt.InferenceSession(model_file_path, providers=['OpenVINOExecutionProvider'], provider_options=[{'device_type' : device}])
+input_name = sess.get_inputs()[0].name
+
+start = time.time()
+for i in tqdm(range(limit)):
+ out = sess.run(None, {input_name: image})
+end = time.time()
+inference_time = end - start
+print(inference_time)
+
","The use of ONNX Runtime with OpenVINO Execution Provider enables the inferencing of ONNX models using ONNX Runtime API while the OpenVINO toolkit runs in the backend.
+This accelerates ONNX model's performance on the same hardware compared to generic acceleration on Intel® CPU, GPU, VPU and FPGA.
+Generally, the CPU Execution Provider works best with a small number of iterations, since its intention is to keep the binary size small. Meanwhile, the OpenVINO Execution Provider is intended for deep learning inference on Intel CPUs, Intel integrated GPUs, and Intel® Movidius™ Vision Processing Units (VPUs).
+This is why the OpenVINO Execution Provider outperforms the CPU Execution Provider over a larger number of iterations.
+You should choose the Execution Provider that meets your own requirements. If you are going to execute complex DL workloads with many iterations, go for the OpenVINO Execution Provider. For a simpler use case, where you need a smaller binary and run fewer iterations, you can choose the CPU Execution Provider instead.
+For more information, you may refer to this ONNX Runtime Performance Tuning
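The timings in the question are also consistent with a one-time startup cost: if the first OpenVINO call costs about 4.47 seconds, the remaining 99 calls average roughly (46.96 - 4.47) / 99, or about 0.43 seconds each. One way to check this is to time the first call separately from the rest. In the sketch below, time_runs and dummy_run are hypothetical helpers, and dummy_run merely stands in for sess.run(None, {input_name: image}):

```python
import time


def time_runs(run_once, iterations):
    '''Return (first_call_seconds, mean_seconds_of_remaining_calls).'''
    start = time.perf_counter()
    run_once()
    first = time.perf_counter() - start

    rest = []
    for _ in range(iterations - 1):
        start = time.perf_counter()
        run_once()
        rest.append(time.perf_counter() - start)
    mean_rest = sum(rest) / len(rest) if rest else 0.0
    return first, mean_rest


# Hypothetical workload standing in for the real inference call.
def dummy_run():
    sum(range(1000))


first, mean_rest = time_runs(dummy_run, 5)
print(first, mean_rest)
```

Comparing mean_rest between the two providers gives a steadier picture than total loop time, whatever the cause of the slow first call turns out to be.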
",python
+"How to pass a parameter into bigquery query on colabI have a Bigquery query on colab:
+from google.colab import auth
+auth.authenticate_user()
+print('Authenticated')
+project_id = '[your project ID]'
+
+sample_count = 2000
+df = pd.io.gbq.read_gbq('''
+ SELECT name, SUM(number) as count
+ FROM `bigquery-public-data.usa_names.usa_1910_2013`
+ WHERE state = 'TX'
+ AND year BETWEEN 1910 AND 1920
+ GROUP BY name
+ ORDER BY count DESC
+ LIMIT 100
+''', project_id=project_id, dialect='standard')
+
+df.head()
+
+It works, but now I try to pass a parameter into the query and replace '1920' in the query WHERE clause. this parameter is dependent on another file
+end_year = max(record.year) # set end_year
+
+df = pd.io.gbq.read_gbq('''
+ SELECT name, SUM(number) as count
+ FROM `bigquery-public-data.usa_names.usa_1910_2013`
+ WHERE state = 'TX'
+ AND year BETWEEN 1910 AND end_year
+ GROUP BY name
+ ORDER BY count DESC
+ LIMIT 100
+''', project_id=project_id, dialect='standard')
+
+df.head()
+
+But I get an error:
+BadRequest: 400 Syntax error: Unexpected identifier "end_year"
+
+I guess the parameter doesn't pass into the query successfully, but I don't know how to fix it.
","As @Mike Karp mentioned, the query in your code is a string, which is why you get an error when you put your variable name directly into the query text.
+You may use Python's f-string to format the string and pass the variable's value into your query.
+from google.colab import auth
+import pandas as pd
+auth.authenticate_user()
+print('Authenticated')
+project_id = 'PROJECT_ID'
+
+end_year = max(record.year) # set end_year
+
+query = (f" SELECT name, SUM(number) as count \
+ FROM `bigquery-public-data.usa_names.usa_1910_2013` \
+ WHERE state = 'TX' \
+ AND year BETWEEN 1910 AND {end_year} \
+ GROUP BY name \
+ ORDER BY count DESC \
+ LIMIT 100")
+
+df = pd.io.gbq.read_gbq(query=query, project_id=project_id, dialect='standard')
+
+df.head()
+
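Because an f-string pastes the value straight into the SQL text, it may be worth making sure the value really is an integer first. A minimal sketch, where build_query is a hypothetical helper and a triple-quoted string replaces the backslash continuations:

```python
def build_query(end_year):
    '''Build the query text with a validated integer year.'''
    # int() raises ValueError for input that is not a plain number.
    end_year = int(end_year)
    return f'''
        SELECT name, SUM(number) as count
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        WHERE state = 'TX'
          AND year BETWEEN 1910 AND {end_year}
        GROUP BY name
        ORDER BY count DESC
        LIMIT 100
    '''


query = build_query(1920)
print(query)
```

The resulting query string can be passed to read_gbq in the same way as in the answer above.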
",python
+"Run out of memory trying to create a tensor of size [2191, 512] with pytorch to save data from movie frames using CLIPI'm using pytorch for the first time and I'm facing a problem I don't think I should. I currently have selected 2919 frames of a movie in jpg. I'm trying to transform all those images into a single tensor. I'm using CLIP to transform each image into a tensor of size [1, 512]. In the end, I expected to have a tensor of size [2919, 512], which should not use that much memory. But my code never finishes running and I can only assume I'm doing something terribly wrong.
+First I'm doing my import and loading the model:
+import torch
+import clip
+from glob import glob
+from PIL import Image
+
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model, preprocess = clip.load("ViT-B/32", device=device)
+
+Secondly, I'm reading the path for all the images and initializing the "film" tensor with random value to overwrite them. I tried generating an empty one and concatenate but that also consumed too much memory:
+path_names = glob(r"Films/**/*.jpg")
+film = torch.rand((len(files), 512), dtype=torch.float32, device = device)
+film_frame_count = 0
+ for file in files:
+ print("Frame " + str(film_frame_count) + " out of " + str(len(files)))
+ film[film_frame_count] = model.encode_image(preprocess(Image.open(file)).unsqueeze(0).to(device))[0]
+ film_frame_count += 1
+ torch.save(film, 'output_tensor/'+ film_code[1])
+
+If anyone could point it out what I'm doing wrong I would appreciate.
","The problem ended up being caused by PyTorch saving the gradients for the graph, so I needed to indicate that I didn't want them to be stored by wrapping the code in this context manager:
+with torch.no_grad():
+    # my code
+
",python
+"python regex to read text file and split row to columnMy text file format is like below
+ID col_A col_B col_C
+1 0.26 0.11 0.18
+2 0.27 0.12 0.17
+3 0.21 0.10 0.15
+----------------------------
+AVG 0.25 0.11 0.17
+----------------------------
+ID col_D col_E col_F
+1 0.23 0.18 0.20
+2 0.24 0.14 0.17
+3 0.23 0.10 0.13
+----------------------------
+AVG 0.23 0.14 0.17
+----------------------------
+
+I'm attempting to use python and regex to export two separate csv files with the format like below
+Table 1
+
+| ID | col_A | col_B | col_C | col_D | col_E | col_F |
+| 1 | 0.26 | 0.11 | 0.18 | 0.23 | 0.18 | 0.20 |
+| 2 | 0.27 | 0.12 | 0.17 | 0.24 | 0.14 | 0.17 |
+| 3 | 0.21 | 0.10 | 0.15 | 0.23 | 0.10 | 0.13 |
+
+Table 2
+
+| | col_A | col_B | col_C | col_D | col_E | col_F |
+| AVG | 0.25 | 0.11 | 0.17 | 0.23 | 0.14 | 0.17 |
+
+Here's my code:
+import re
+import pandas as pd
+
+
+with open('test.txt') as file:
+ lines = file.readlines()
+ regex = r'\A(?P<ID>\S+)\s*(?P<COL_A>\S+)\s*(?P<COL_B>\S+)\s*(?P<COL_C>\S+)'
+ data = []
+
+ for line in lines:
+ m = re.search(regex, line)
+ if m != None:
+ data.append([m.group(1),m.group(2),m.group(3),m.group(4)])
+
+ df = pd.DataFrame(data)
+
+df.to_csv('test.csv', index = False)
+
+My code would result in a strange format like
+
+| 0 | 1 | 2 | 3 |
+| ID | col_A | col_B | col_C |
+| 1 | 0.26 | 0.11 | 0.18 |
+| 2 | 0.27 | 0.12 | 0.17 |
+| 3 | 0.21 | 0.10 | 0.15 |
+| ------ | --------- | --------- | --------- |
+| AVG | 0.25 | 0.11 | 0.17 |
+| ------ | --------- | --------- | --------- |
+| ID | col_D | col_E | col_F |
+| 1 | 0.23 | 0.18 | 0.20 |
+| 2 | 0.24 | 0.14 | 0.17 |
+| 3 | 0.23 | 0.10 | 0.13 |
+| ------ | --------- | --------- | --------- |
+| AVG | 0.23 | 0.14 | 0.17 |
+| ------ | --------- | --------- | --------- |
+
+How can I modify my code to achieve my request? Thank you!
","You can use
+import re, itertools
+import pandas as pd
+
+data = []
+colnames = []
+avg = []
+start = True
+is_next_avg = False
+id_start_label = ''
+
+with open('test.txt', 'r') as file:
+ for line in file:
+ line = line.strip()
+ if start and line.replace('-', '').strip():
+ id_start_label = line.split()[0]
+ colnames.extend(line.split()[1:])
+ start = False
+ elif '---' in line and not line.replace('-', '').strip():
+ if start:
+ continue
+ is_next_avg = not is_next_avg
+ elif is_next_avg:
+ avg.extend(line.split()[1:])
+ is_next_avg = not is_next_avg
+ start = not start
+ else:
+ data.append(line.split())
+
+colnames.insert(0, id_start_label)
+data = sorted(data, key=lambda x: x[0])
+new_data = []
+for key, g in itertools.groupby(data, key=lambda x: x[0]):
+ entry = [key]
+ entry.extend( [i for x in g for i in x[1:]] )
+ new_data.append(entry)
+
+df = pd.DataFrame(new_data, columns=colnames)
+df2 = pd.DataFrame(avg, columns=['AVG'])
+df2 = df2.transpose()
+df2.columns = df.columns[1:]
+
+Output:
+>>> df
+ ID col_A col_B col_C col_D col_E col_F
+0 1 0.26 0.11 0.18 0.23 0.18 0.20
+1 2 0.27 0.12 0.17 0.24 0.14 0.17
+2 3 0.21 0.10 0.15 0.23 0.10 0.13
+
+>>> df2
+ col_A col_B col_C col_D col_E col_F
+AVG 0.25 0.11 0.17 0.23 0.14 0.17
+
",python
+"Selenium: ""Unable to find session with ID"" after a few minutes of idlingI started a Docker container with:
+docker run -d --shm-size="4g" --hostname selenium_firefox selenium/standalone-firefox
+In another container with Python:
+...
+>>> driver = webdriver.Remote(command_executor="http://" +selenium_host+":4444/w
+d/hub", desired_capabilities=DesiredCapabilities.FIREFOX, keep_alive=True)
+
+>>> driver.title
+''
+>>> driver.title
+Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdri
+ver.py", line 447, in title
+ resp = self.execute(Command.GET_TITLE)
+ File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdri
+ver.py", line 424, in execute
+ self.error_handler.check_response(response)
+ File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/errorh
+andler.py", line 247, in check_response
+ raise exception_class(message, screen, stacktrace)
+selenium.common.exceptions.WebDriverException: Message: Unable to execute reques
+t for an existing session: Unable to find session with ID: 5c619451-8361-4ec9-9b
+7e-58b7afac15ff
+Build info: version: '4.1.1', revision: 'e8fcc2cecf'
+System info: host: 'selenium_firefox', ip: '172.17.0.3', os.name: 'Linux', os.ar
+ch: 'amd64', os.version: '5.4.0-89-generic', java.version: '11.0.13'
+Driver info: driver.version: unknown
+
+I ran the first driver.title immediately after creating the remote webdriver.
+Then I waited for some time (around 15 minutes) and ran driver.title again, and it seems that the Python console has lost its connection to the corresponding browser.
+Why does this happen and how do I avoid it? It doesn't happen if I don't use a remote webdriver.
","Option 1: Override Docker Selenium Grid default session timeout
+From docker/selenium documentation:
+
+Grid has a default session timeout of 300 seconds, where the session can be on a stale state until it is killed. You can use SE_NODE_SESSION_TIMEOUT to overwrite that value in seconds.
+
+docker run -d -e SE_NODE_SESSION_TIMEOUT=1000 --shm-size="4g" --hostname selenium_firefox selenium/standalone-firefox
+
+Option 2: Ping your session once in 60 (any < 300) seconds
+You may execute some driver command in a loop during the idle time
+import time
+
+for x in range(15):
+    time.sleep(60)
+    driver.current_url  # any lightweight command, issued before the timeout elapses
+
+
+Reference
+https://github.com/SeleniumHQ/docker-selenium#grid-url-and-session-timeout
",python
+"ImportError: cannot import name '_obtain_input_shape' in kerasWhen I try to import keras_squeezenet I get this error:
+Traceback (most recent call last):
+ File "C:/Users/belog/drone_sees/train_model.py", line 3, in <module>
+ from keras_squeezenet import SqueezeNet
+ File "C:\Users\belog\AppData\Local\Programs\Python\Python36\lib\site-packages\keras_squeezenet\__init__.py", line 1, in <module>
+ from keras_squeezenet.squeezenet import SqueezeNet
+ File "C:\Users\belog\AppData\Local\Programs\Python\Python36\lib\site-packages\keras_squeezenet\squeezenet.py", line 1, in <module>
+ from keras.applications.imagenet_utils import _obtain_input_shape
+ImportError: cannot import name '_obtain_input_shape'
+
+Here is the import code:
+import tensorflow as tf
+from keras_squeezenet import SqueezeNet
+from keras.optimizers import Adam
+from keras.utils import np_utils
+from keras.layers import Activation, Dropout, Convolution2D, GlobalAveragePooling2D
+from keras.models import Sequential
+
+How to fix it? (I'm using tensorflow==2.6.2, keras==2.6.0, keras-squeezenet==0.4).
","Have you tried the newer version? (see: https://github.com/rcmalli/keras-squeezenet)
+You can install it with:
+pip install git+https://github.com/rcmalli/keras-squeezenet.git
",python
+"how to check if value in a DataFrame is a type DecimalI am writing a data test for some api calls that return a DataFrame with a date type and a type Decimal. I can't find a way to verify the Decimal
+the DataFrame is returned as 2022-01-18 12:35:00 2856.8430
+So I have
+result = ds.DataService.get_recent(5, 24)
+assert not result.empty
+assert ptypes.is_datetime64_any_dtype(result['foo'])
+
+but if I try
+assert all(ptypes.is_float_dtype(result[col]) for col in result["foo1"])
+
+ raise KeyError(key) from err
+KeyError: Decimal('2873.6408')
+
","So given a series like result["foo1"], you can check that with
+from decimal import Decimal
+
+import pandas as pd
+
+is_decimal: bool = pd.core.dtypes.common.is_dtype_equal(result["foo1"], Decimal)
+
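A Series holding Decimal values has the generic object dtype, so a dtype comparison may not distinguish Decimal from other Python objects. A stricter, element-wise check can be sketched with the standard library alone; all_decimal is a hypothetical helper and the sample values are taken from the question:

```python
from decimal import Decimal


def all_decimal(values):
    '''True when every element is a Decimal instance.'''
    return all(isinstance(v, Decimal) for v in values)


prices = [Decimal('2856.8430'), Decimal('2873.6408')]
mixed = [Decimal('2856.8430'), 2873.6408]
print(all_decimal(prices), all_decimal(mixed))
```

Iterating a pandas Series yields its values, so the same helper could be called as all_decimal(result['foo1']) in the test.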
",python
+"how does one match EOL (newline) with lark?I'm using the lark parser with python. I'd like to use EOL as part of the grammar since it is line oriented. I'm getting an error when I try to put the regex in for matching EOL. I see some examples like this:
+CR : /\r/
+LF : /\n/
+NEWLINE: (CR? LF)+
+
+but they don't work for me. this is my code:
+import sys
+import lark
+
+class Parser:
+ grammar = '''
+ start : STRING EOL
+ STRING : /\w+/
+ EOL : /\n/x
+ '''
+
+ parser = lark.Lark(grammar)
+
+ def __init__(self, fname):
+ self.fname = fname
+ self.ast(open(fname))
+
+ def ast(self, fh):
+ tree = self.parser.parse(fh.read())
+ print(tree.pretty())
+
+def main():
+ x = Parser(sys.argv[1])
+
+main()
+
+and here's the error I get:
+Traceback (most recent call last):
+ File "./p2.py", line 6, in <module>
+ class Parser:
+ File "./p2.py", line 13, in Parser
+ parser = lark.Lark(grammar)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/lark.py", line 413, in __init__
+ self.parser = self._build_parser()
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/lark.py", line 456, in _build_parser
+ return parser_class(self.lexer_conf, parser_conf, options=self.options)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/parser_frontends.py", line 242, in __call__
+ return ParsingFrontend(lexer_conf, parser_conf, options)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/parser_frontends.py", line 61, in __init__
+ self.parser = create_parser(lexer_conf, parser_conf, options)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/parser_frontends.py", line 209, in create_earley_parser
+ return f(lexer_conf, parser_conf, options, resolve_ambiguity=resolve_ambiguity, debug=debug, tree_class=tree_class, **extra)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/parser_frontends.py", line 186, in create_earley_parser__dynamic
+ earley_matcher = EarleyRegexpMatcher(lexer_conf)
+ File "/grid/common/pkgs/python/v3.7.2/lib/python3.7/site-packages/lark/parser_frontends.py", line 172, in __init__
+ raise GrammarError("Dynamic Earley doesn't allow zero-width regexps", t)
+lark.exceptions.GrammarError: ("Dynamic Earley doesn't allow zero-width regexps", TerminalDef('EOL', '(?x:\n)'))
+
","I forgot about needing to use raw strings. had to add an "r". the new code looks like:
+class Parser:
+ grammar = r'''
+ start : STRING EOL
+ STRING : /\w+/
+ EOL : /\n/
+ '''
+
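A likely explanation for the original error, sketched with the re module rather than lark: in a non-raw Python string, \n becomes an actual newline before the grammar is parsed, and with the x (verbose) flag that newline is ignored as whitespace, leaving an empty, zero-width pattern. The traceback's TerminalDef('EOL', '(?x:\n)') is consistent with this reading.

```python
import re

plain = '\n'    # one character: an actual newline
raw = r'\n'     # two characters: a backslash followed by n

# In verbose mode a literal newline in the pattern is ignored as whitespace,
# so the pattern is effectively empty and matches the empty string.
empty_match = re.fullmatch(plain, '', re.VERBOSE)

# The raw form keeps the escape sequence, which the regex engine reads as a newline.
newline_match = re.fullmatch(raw, '\n', re.VERBOSE)

print(len(plain), len(raw), empty_match is not None, newline_match is not None)
```

Either change on its own (the raw string, or dropping the x flag) avoids the empty pattern; the fix above applies both.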
",python
+"python: How to read file independent from entry point (both notebook and .py files)I'm working on a python project that looks like this
+app
+├── data
+│ └── my_data.csv
+├── utils
+│ └── reader.py
+├── scripts
+│ └── s0.py
+├── notebooks
+│ └── n0.ipynb
+└── main.py
+
+
+In reader.py:
+def read_file():
+ with open('./data/my_data.csv'):
+ print('I can read the file !')
+
+
+In the s0.py, main.py and n0.ipynb files:
+from utils.reader import read_file
+read_file()
+
+To run s0 and main, I use the command: python -m scripts.s0 and python -m main and it works fine.
+But when I try to run the notebook, it does not work. I understand why (it looks for ./data/my_data.csv, but since the notebook is not at root level, the file is not found).
+Is there a way to make file reading independent of the entry point in Python?
+In JavaScript, for instance, it would be easy to do: I would use the path ../data/my_data.csv in the reader file and it would work regardless of where the function is called from.
","You can find the path of the current file with the __file__ variable (also see this answer).
+From that, you can construct the required path. In reader.py:
+from pathlib import Path
+
+# Directory that contains utils/, i.e. the app root
+APPDIR = Path(__file__).parent.parent.resolve()
+
+def read_file():
+    with open(APPDIR / "data" / "my_data.csv"):
+        print('I can read the file !')
+
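To see that this really is independent of the working directory, here is a small self-contained sketch. It rebuilds a miniature copy of the project in a temporary folder, switches into the notebooks directory, and reads the file anyway. The package name demo_utils and the file contents are made up for the demo; everything else follows the layout in the question.

```python
import os
import sys
import tempfile
import textwrap
from pathlib import Path

# Recreate a miniature version of the project in a temporary directory.
# 'demo_utils' is a made-up package name standing in for 'utils'.
root = Path(tempfile.mkdtemp())
(root / 'data').mkdir()
(root / 'data' / 'my_data.csv').write_text('a,b\n1,2\n')
(root / 'demo_utils').mkdir()
(root / 'demo_utils' / '__init__.py').write_text('')
(root / 'demo_utils' / 'reader.py').write_text(textwrap.dedent('''\
    from pathlib import Path

    APPDIR = Path(__file__).resolve().parent.parent

    def read_file():
        with open(APPDIR / 'data' / 'my_data.csv') as f:
            return f.read()
'''))
(root / 'notebooks').mkdir()

sys.path.insert(0, str(root))     # make the package importable
os.chdir(root / 'notebooks')      # simulate a notebook's working directory

from demo_utils.reader import read_file

content = read_file()             # found even though ./data does not exist here
print(content)
```

The path is built from the location of reader.py itself, so the caller's current directory no longer matters.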
",python
+"How to plot number of events occurring at each hour of the day in a pandas dataframe?Say I have the following data:
+import pandas as pd
+data = {'time':[7, 1, 2, 7, 2, 2, 1, 2, 7, 3, 5], 'event':['a', 'b', 'a', 'a', 'b', 'a', 'a', 'b', 'b', 'b', 'a']}
+df = pd.DataFrame(data)
+
+I want to display how many events of each type occurred at each hour of the day. However, there are only 5 unique times present in the "time" column of the dataset.
+Plotting a histogram with bins=24 works when all the 24 unique hours of the day (1 to 24) are present in the dataset. But if only a few hours of the day are present, histogram doesn't do this task.
+For example, with the above data, the code df.hist() produces this chart:
+![]()
+It is unclear where the x-axis ticks are located exactly - what I want is, that the 5 spikes in this chart should be located at x = 1, 2, 3, 5 and 7, and there should be no spikes present at x = 4, 6 and 8 through 24.
+With df.time.hist(bins=24), the following chart is produced:
+![]()
+Here, it is a bit better as we can see that at least the first 4 spikes are located at x = 1, 2, 3, and 5, with x = 4 and x = 6 being left blank. However, at x=7, the spike is drawn to the left of the grid lines, while the other 4 spikes are drawn to the right of the grid lines. Also, this doesn't display the empty spikes at x = 8 through 24.
+So, how do I do it?
","Try this:
+import pandas as pd
+import matplotlib.pyplot as plt
+import numpy as np
+data = {'time':[7, 1, 2, 7, 2, 2, 1, 2, 7, 3, 5], 'event':['a', 'b', 'a', 'a', 'b', 'a', 'a', 'b', 'b', 'b', 'a']}
+df = pd.DataFrame(data)
+fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(16, 10))
+
+df.hist(ax=axes, bins=range(24))
+
+# offset the xticks
+axes.set_xticks(np.arange(24) + .5)
+
+# name the label accordingly
+axes.set_xticklabels(range(24))
+
",python
+"Python Overridden __add__ Doesn't Work for a += bI've been working on a Vector object class to allow me to solve math problems quickly using Python.
+For Example:
+If we set vector = Vector(1, 2, 3) and we run vector *= 2, then vector will be equal to <2, 4, 6>
+The Issue: Whenever I run vector += 2, it errors saying
+Traceback (most recent call last):
+ File "C:\Users\User\PycharmProjects\Math\matrices.py", line 149, in <module>
+ main()
+ File "C:\Users\User\PycharmProjects\Math\matrices.py", line 139, in main
+ vector += 2
+TypeError: 'int' object is not iterable
+
+Process finished with exit code 1
+
+Here is my Vector class
+class Vector(list):
+ def __init__(self, *args):
+ super().__init__()
+ for arg in args:
+ self.append(arg)
+
+ def __mul__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row * other for row in self])
+
+ def __truediv__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row / other for row in self])
+
+ def __add__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ print(type(other))
+ return Vector(*[row + other for row in self])
+
+ def __sub__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row - other for row in self])
+
+ def __str__(self):
+ return "<" + ", ".join([str(e) for e in self]) + ">"
+
+Here is my main function
+def main():
+ vector = Vector(1, 2, 3)
+ vector *= 2
+ print(vector)
+ vector += 2
+ print(vector)
+
+What I've done to find the solution:
+
+- I've tried to do
vector = vector + 2 and that works, but I would
+rather understand why it's not working and be able to fix it.
+- I've tried googling how operator overloading works in Python
+- I've tried looking through operator overloading threads on StackOverflow, but I don't see any that answer my question. They're all about how a = a + b, not a += b.
+- I've tried using iadd, but it doesn't work.
+
","Just add the line __iadd__ = __add__:
+class Vector(list):
+ def __init__(self, *args):
+ super().__init__()
+ for arg in args:
+ self.append(arg)
+
+ def __mul__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row * other for row in self])
+
+ def __truediv__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row / other for row in self])
+
+ def __add__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ print(type(other))
+ return Vector(*[row + other for row in self])
+
+ __iadd__ = __add__
+
+ def __sub__(self, other):
+ assert isinstance(other, float) or isinstance(other, int)
+ return Vector(*[row - other for row in self])
+
+ def __str__(self):
+ return "<" + ", ".join([str(e) for e in self]) + ">"
+
+
+vector = Vector(1, 2, 3)
+vector *= 2
+print(vector)
+vector += 2
+print(vector)
+
+Prints:
+<2, 4, 6>
+<class 'int'>
+<4, 6, 8>
+
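Some background on why this happens, as a sketch with trimmed-down classes (Plain and the alias variable exist only for this illustration): list already defines __iadd__ as an in-place extend, and for += Python uses __iadd__ in preference to __add__. That is why the int ends up being iterated. Note that __iadd__ = __add__ makes += build a new Vector; if you would rather modify the existing object, one option is an __iadd__ that mutates and returns self:

```python
class Plain(list):
    """Only overrides __add__, like the class in the question."""
    def __add__(self, other):
        return Plain(x + other for x in self)


p = Plain([1, 2, 3])
try:
    p += 2                      # dispatches to list.__iadd__, i.e. extend(2)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)


class Vector(list):
    def __init__(self, *args):
        super().__init__(args)

    def __add__(self, other):
        return Vector(*[x + other for x in self])

    def __iadd__(self, other):
        # Replace the contents in place instead of extending the list.
        self[:] = [x + other for x in self]
        return self


v = Vector(1, 2, 3)
alias = v
v += 2
print(v, alias is v)
```

With the in-place version, other references to the same vector see the update; with __iadd__ = __add__ they keep pointing at the old object.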
",python
+"Selection of columnsI work with a Pandas dataframe. I want to aggregate data by one column and then sum the other columns. You can see an example below:
+ data = {'name': ['Company1', 'Company2', 'Company1', 'Company2', 'Company5'],
+ 'income': [0, 180395, 4543168, 7543168, 73],
+ 'turnover': [4, 24, 31, 2, 3]}
+ df = pd.DataFrame(data, columns = ['name', 'income', 'turnover'])
+ df
+
+INCOME_GROUPED = df.groupby(['name']).agg({'income':sum,'turnover':sum})
+
+The code above works well and gives the expected result. The next step is selection: I want to select only two columns from the INCOME_GROUPED dataframe.
+INCOME_SELECT = INCOME_GROUPED[['name','income']]
+
+But after executing this line of code I get this error:
+"None of [Index(['name', 'income'], dtype='object')] are in the [columns]"
+
+So can anybody help me how to solve this problem ?
","You need to call reset_index() after agg():
+INCOME_GROUPED = df.groupby(['name']).agg({'income':sum,'turnover':sum}).reset_index()
+# ^^^^^^^^^^^^^^ add this
+
+Output:
+>>> INCOME_GROUPED[['name', 'income']]
+ name income
+0 Company1 4543168
+1 Company2 7723563
+2 Company5 73
+
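As an alternative sketch (not part of the original answer), you can pass as_index=False to groupby so that name stays a regular column and no reset_index() call is needed. The data is copied from the question; the lowercase variable names are just for this example.

```python
import pandas as pd

data = {'name': ['Company1', 'Company2', 'Company1', 'Company2', 'Company5'],
        'income': [0, 180395, 4543168, 7543168, 73],
        'turnover': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data)

# as_index=False keeps the grouping key as an ordinary column
income_grouped = df.groupby('name', as_index=False)[['income', 'turnover']].sum()
income_select = income_grouped[['name', 'income']]
print(income_select)
```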
",python
+"python: write a dataframe groupby to a fileI have a file with the following that I am reading with python
+Item Master Primary Spec/Common Information/Contract Number||Contract Master Primary Spec/cage code
+8AND3||SP47W117D0015
+8AND3||SP47W117D0015
+8AND3||SP47W117D0015
+8AND3||SP47W117D0015
+8AND3||SP47W117D0015
+8C1C2||N6247820D2401
+8C1C2||N6247820D2401
+8C1C2||N6247820D2401
+
+I am trying to get a count of the number of contracts. The below code seems to work when I print it (although the header columns are reversed for some reason), but not when I try to output it to a file.
+import pandas as pd
+
+fname="mdm.export.item.master.delta.1335.20220120011500_125_125.csv"
+fdir="./data/"
+df = pd.read_csv(fdir+fname, sep='\|\|', keep_default_na=False, engine='python')
+
+uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count()
+print(uniqContract)
+
+file = open("testfile.txt","w")
+for items in uniqContract:
+ file.writelines(items+'\n')
+file.close()
+
+This is the print output
+(base) PS D:\02-MyLocalFiles> python .\helloworld.py
+Contract Master Primary Spec/cage code Item Master Primary Spec/Common Information/Con...
+8AND3 5
+8C1C2 3
+(base) PS D:\02-MyLocalFiles\python\backlog_report>
+
+But this is the output to the file
+Contract Master Primary Spec/cage code
+
+What am I doing wrong?
","uniqContract=df.groupby(['Item Master Primary Spec/Common Information/Contract Number']).count().reset_index()
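+uniqContract.to_csv('testfile.txt', sep='\t')
+
+You can call .reset_index() on your groupby count aggregation, then write the result directly to a text file with to_csv. The sep argument sets the separator (a tab here). Your loop wrote only the header because iterating over a DataFrame yields its column names, not its rows.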
",python
+"Implement mask for anchor-negative in list of tripletsI have a generator in producing my TensorFlow data, as triplets (anchor, positive, negative), in batches. Each batch is a list of such triplets, making up the list labels. Using code from Moindrot's blog on triplet loss we get a mask for positives and negatives: With
+labels_equal = tf.equal(tf.expand_dims(labels, 0), tf.expand_dims(labels, 1))
+mask = tf.logical_not(labels_equal)
+
+we get a mask of negatives (non-equal labels). If I have labels=[1,1,2,3,3,4] the mask will be:
+[[ F F T T T T ],
+ [ F F T T T T ],
+ [ T T F T T T ],
+ [ T T T F F T ],
+ [ T T T F F T ],
+ [ T T T T T F ]]
+
+The labels are set up such that in groups of 3 they form anchor, positive and negative (triplet).
+I'm trying to implement a negative mask for balanced triplets. I.e. a mask where only the negative to its anchor is True for each line. How do I find this negative mask such that only the anchor-negative entries percolate through?
+Expected output for labels=[1,1,2,3,3,4]:
+[[ F F T F F F ],
+ [ F F F F F F ],
+ [ T F F F F F ],
+ [ F F F F F T ],
+ [ F F F F F F ],
+ [ F F F T F F ]]
+
+
+Notes:
+
+- I have tried to use tfa's triplet loss, but it's not balanced (bad for testing accuracy, recall, etc.).
+- Labels have been relabeled, such that the labels of each triplet is batch-unique and won't match other triplets in the batch.
+
","If you really only want anchor-negative True in the negative mask, you can accomplish this with a mix of tf.tile, tf.linalg.band_part and tf.transpose:
+labels_equal = tf.equal(tf.expand_dims(labels, 0), tf.expand_dims(labels, 1))
+mask = tf.logical_not(labels_equal)
+triplet_mask = tf.tile(
+ [
+ [False, False, True],
+ [False, False, False],
+ [False, False, False],
+ ],
+ multiples=[d // 3 for d in mask.shape],
+)
+triplet_mask = tf.linalg.band_part(triplet_mask, 0, 2)
+triplet_mask = triplet_mask | tf.transpose(triplet_mask)
+
+I would suggest, however, you rather change the model to accept a list of triplets. It simplifies both training and testing.
",python
+"Selenium Python >> get a value of an attribute from the console log of the webpageHow do we get the value of an attribute from the console log of the webpage? For example, in the screenshot I have highlighted the following:
+page_renderer: "articleRenderer"
+When the webpage loads I would like to verify if the value for page_renderer is "articleRenderer" or if it is "articleRenderer2".
+Could you please let me know? Thanks.
","Based on your screenshot:
+page_renderer_value = driver.execute_script('return rtech.visitor.page_renderer')
+
",python
+"Iterating within groups until a column changes in pandasI have the following input df:
+ domain ip timestamp
+0 Google 101 2020-04-01 23:01:41
+1 Google 101 2020-04-01 23:01:59
+2 Google 101 2020-04-02 12:01:41
+3 Facebook 101 2020-04-02 13:11:33
+4 Facebook 101 2020-04-02 13:11:35
+5 Youtube 103 2020-04-21 13:01:41
+6 Youtube 103 2020-04-21 13:11:46
+7 Youtube 103 2020-04-22 01:01:01
+8 Google 103 2020-04-22 02:11:23
+9 Facebook 103 2020-04-23 14:11:13
+10 Youtube 103 2020-04-23 14:11:55
+
+How can I get this output? Where domain_num is an iterator that increases everytime a domain switches within an IP.
+
+ domain ip timestamp domain_num
+0 Google 101 2020-04-01 23:01:41 1
+1 Google 101 2020-04-01 23:01:59 1
+2 Google 101 2020-04-02 12:01:41 1
+3 Facebook 101 2020-04-02 13:11:33 2
+4 Facebook 101 2020-04-02 13:11:35 2
+5 Youtube 103 2020-04-21 13:01:41 1
+6 Youtube 103 2020-04-21 13:11:46 1
+7 Youtube 103 2020-04-22 01:01:01 1
+8 Google 103 2020-04-22 02:11:23 2
+9 Facebook 103 2020-04-23 14:11:13 3
+10 Youtube 103 2020-04-23 14:11:55 4
+
+I tried something like this which gets the counts but I need to group it by ip
+df['domain'].ne(df['domain'].shift()).cumsum()
+
+This code below errors out
+df.groupby('ip').apply(lambda x : x[x.domain.ne(x.domain.shift().cumsum())])
+
+Data
+import pandas as pd
+
+data = {'domain':['Google', 'Google', 'Google', 'Facebook', 'Facebook', 'Youtube', 'Youtube', 'Youtube', 'Google', 'Facebook', 'Youtube'],
+ 'ip':[101, 101, 101, 101, 101, 103, 103, 103, 103, 103, 103],
+ 'timestamp' : ['2020-04-01 23:01:41', '2020-04-01 23:01:59', '2020-04-02 12:01:41', '2020-04-02 13:11:33',
+ '2020-04-02 13:11:35', '2020-04-21 13:01:41', '2020-04-21 13:11:46',
+ '2020-04-22 01:01:01', '2020-04-22 02:11:23','2020-04-23 14:11:13', '2020-04-23 14:11:55' ]}
+
+df = pd.DataFrame(data)
+df['timestamp']= pd.to_datetime(df['timestamp'])
+
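","Assuming your dataframe is sorted by the timestamp column, compute the counter within each ip group. transform keeps the original index, so the result can be assigned straight back (in recent pandas versions, apply would prepend the group key to the index and the assignment would fail):
+inc_domain_num = lambda x: x.ne(x.shift()).cumsum()
+df['domain_num'] = df.groupby('ip')['domain'].transform(inc_domain_num)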
+print(df)
+
+# Output
+ domain ip timestamp domain_num
+0 Google 101 2020-04-01 23:01:41 1
+1 Google 101 2020-04-01 23:01:59 1
+2 Google 101 2020-04-02 12:01:41 1
+3 Facebook 101 2020-04-02 13:11:33 2
+4 Facebook 101 2020-04-02 13:11:35 2
+5 Youtube 103 2020-04-21 13:01:41 1
+6 Youtube 103 2020-04-21 13:11:46 1
+7 Youtube 103 2020-04-22 01:01:01 1
+8 Google 103 2020-04-22 02:11:23 2
+9 Facebook 103 2020-04-23 14:11:13 3
+10 Youtube 103 2020-04-23 14:11:55 4
+
",python
+"RPM installation Trino throws python dependencyI'm trying to install Trino using RPM on Red Hat Enterprise Linux distribution. I install the Trino dependencies using the following commands:
+$ sudo yum update -y
+$ sudo yum install -y java-11-openjdk.x86_64 python3
+$ sudo alternatives --set python /usr/bin/python3
+
+Then I try to install Trino from archive in single-node mode. This however gives a dependency error:
+$ sudo rpm -i trino-server-rpm-368.rpm
+error: Failed dependencies:
+ python >= 2.4 is needed by trino-server-rpm-0:368-1.noarch
+
+This error doesn't make sense to me given that this dependency is actually satisfied when checking my python version:
+$ python -V
+Python 3.6.8
+
","An answer has been provided by @hashhar on this GitHub issue. If you actually have the correct dependencies installed, skip the RPM dependency check with --nodeps:
+$ sudo rpm -i --nodeps trino-server-rpm-368.rpm
+
",python
+"Sorting Algorithm output at end of pass 3Given the following initially unsorted list:
+[77, 101, 40, 43, 81, 129, 85, 144]
+Which sorting algorithm produces the following list at the end of Pass Number 3? Is it Bubble, Insertion or Selection?
+[40, 43, 77, 81, 85, 101, 129, 144]
+Can someone give me a clue on how I can solve this please.
","Insertion sort:
+def insertion_sort(array):
+ for i in range(1, len(array)):
+ key_item = array[i]
+ j = i - 1
+ while j >= 0 and array[j] > key_item:
+ array[j + 1] = array[j]
+ j -= 1
+ array[j + 1] = key_item
+ print("Step",i,":",array)
+ return array
+
+data=[77, 101, 40, 43, 81, 129, 85, 144]
+insertion_sort(data)
+
+Output:
+Step 1 : [77, 101, 40, 43, 81, 129, 85, 144]
+Step 2 : [40, 77, 101, 43, 81, 129, 85, 144]
+Step 3 : [40, 43, 77, 101, 81, 129, 85, 144]
+Step 4 : [40, 43, 77, 81, 101, 129, 85, 144]
+Step 5 : [40, 43, 77, 81, 101, 129, 85, 144]
+Step 6 : [40, 43, 77, 81, 85, 101, 129, 144]
+Step 7 : [40, 43, 77, 81, 85, 101, 129, 144]
+
+Bubble sort:
+def bubble_sort(array):
+ n = len(array)
+ for i in range(n):
+ already_sorted = True
+ for j in range(n - i - 1):
+ if array[j] > array[j + 1]:
+ array[j], array[j + 1] = array[j + 1], array[j]
+ already_sorted = False
+ if already_sorted:
+ break
+ print("Step:",n-j-1)
+ print(array)
+ return array
+
+data = [77, 101, 40, 43, 81, 129, 85, 144]
+bubble_sort(data)
+
+Output:
+Step: 1
+[77, 40, 43, 81, 101, 85, 129, 144]
+Step: 2
+[40, 43, 77, 81, 85, 101, 129, 144]
+
+Selection Sort:
+def selectionSort(array, size):
+ for step in range(size):
+ min_idx = step
+
+ for i in range(step + 1, size):
+ if array[i] < array[min_idx]:
+ min_idx = i
+ (array[step], array[min_idx]) = (array[min_idx], array[step])
+ print("step",step+1,":",end="")
+
+ print(array)
+
+data = [77, 101, 40, 43, 81, 129, 85, 144]
+size = len(data)
+selectionSort(data, size)
+
+Output:
+step 1 :[40, 101, 77, 43, 81, 129, 85, 144]
+step 2 :[40, 43, 77, 101, 81, 129, 85, 144]
+step 3 :[40, 43, 77, 101, 81, 129, 85, 144]
+step 4 :[40, 43, 77, 81, 101, 129, 85, 144]
+step 5 :[40, 43, 77, 81, 85, 129, 101, 144]
+step 6 :[40, 43, 77, 81, 85, 101, 129, 144]
+step 7 :[40, 43, 77, 81, 85, 101, 129, 144]
+step 8 :[40, 43, 77, 81, 85, 101, 129, 144]
+
+You can also get more guidelines from the link below how to run algorithms:
+https://realpython.com/sorting-algorithms-python/
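To answer the question directly, here is a sketch that runs exactly three passes of each algorithm and compares the result with the target list. It assumes one pass means one iteration of the outer loop, as in the code above; the function names are made up for this check.

```python
def bubble_passes(values, passes):
    a = list(values)
    n = len(a)
    for i in range(passes):
        for j in range(n - i - 1):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a


def insertion_passes(values, passes):
    a = list(values)
    for i in range(1, passes + 1):
        key_item = a[i]
        j = i - 1
        while j >= 0 and a[j] > key_item:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key_item
    return a


def selection_passes(values, passes):
    a = list(values)
    for step in range(passes):
        # index of the smallest remaining element
        min_idx = min(range(step, len(a)), key=a.__getitem__)
        a[step], a[min_idx] = a[min_idx], a[step]
    return a


data = [77, 101, 40, 43, 81, 129, 85, 144]
target = [40, 43, 77, 81, 85, 101, 129, 144]

candidates = {
    'bubble': bubble_passes,
    'insertion': insertion_passes,
    'selection': selection_passes,
}
matches = [name for name, sort in candidates.items() if sort(data, 3) == target]
print(matches)
```

Under that definition of a pass, only bubble sort has the list fully sorted by the end of pass 3; insertion and selection sort both still have 101 ahead of 81 at that point.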
",python
+"How can I convert a gpa calculator to use upper and lower case grade arguments?I am creating a GPA calculator which takes in 4 grades and gives the GPA. I want it to be able to use upper case letters 'A' and lower case letters 'a' and so on through 'F' 'f'. Right now it will only use upper case letters. How can I convert it to use both without changing the dictionary?
+import sys
+def gpa_calculator(grade1, grade2, grade3, grade4):
+ points = 0
+ i = 0
+ grade_c ={'A':4.0, 'A-':3.66, 'B+':3.33, 'B':3.0,
+ 'B-':2.66,'C+':2.33,'C':2.0,'C-':1.66,'D+':1.33,'D':1.00,'D-':.66,'F':0.00}
+
+ if grades != []:
+ for grade in grades:
+ points += grade_c[grade]
+ gpa = points / len(grades)
+ return gpa
+ else:
+ return None
+
+ grade1 = sys.argv[1]
+ grade2 = sys.argv[2]
+ grade3 = sys.argv[3]
+ grade4 = sys.argv[4]
+
+ grades = grade1, grade2, grade3, grade4
+ grades = gpa_calculator(grade1, grade2, grade3, grade4)
+ print('My GPA is',(grades))
+
","You have at least two options
+1. Add the lowercase versions of the keys to the dictionary, since dictionary keys are case-sensitive. This makes the dictionary larger and repeats data unnecessarily.
+grade_c = {'A':4.0, 'A-':3.66, 'B+':3.33, 'B':3.0, 'B-':2.66,'C+':2.33,'C':2.0,'C-':1.66,'D+':1.33,'D':1.00,'D-':.66,'F':0.00, 'a':4.0, 'a-':3.66, 'b+':3.33, 'b':3.0, 'b-':2.66,'c+':2.33,'c':2.0,'c-':1.66,'d+':1.33,'d':1.00,'d-':.66,'f':0.00 }
+
+
+2. Keep your dictionary as is and convert the command-line arguments to uppercase with the str.upper() method before looking them up. The user can then pass grades in either case, and your current dictionary and code handle them unchanged.
+
+ grade1 = sys.argv[1].upper()
+ grade2 = sys.argv[2].upper()
+ grade3 = sys.argv[3].upper()
+ grade4 = sys.argv[4].upper()
+
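If you would rather normalise inside the function than at every call site, a minimal sketch could look like this. It takes any number of grades instead of exactly four, and the name GRADE_POINTS is my own; the point values are the ones from your dictionary.

```python
GRADE_POINTS = {'A': 4.0, 'A-': 3.66, 'B+': 3.33, 'B': 3.0,
                'B-': 2.66, 'C+': 2.33, 'C': 2.0, 'C-': 1.66,
                'D+': 1.33, 'D': 1.00, 'D-': .66, 'F': 0.00}


def gpa_calculator(*grades):
    """Average the grade points; grades may be upper or lower case."""
    if not grades:
        return None
    # upper() leaves '+' and '-' untouched, so 'b+' becomes 'B+'
    points = sum(GRADE_POINTS[grade.strip().upper()] for grade in grades)
    return points / len(grades)


gpa = gpa_calculator('a', 'B+', 'c-', 'F')
print('My GPA is', gpa)
```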
",python
+"Python: How to dynamically set inner class class variableI am working with peewee and SQLite on a project. I have two files:
+file1.py
+if __name__ == '__main__':
+ db = Thing()
+
+file2.py
+DB_FILENAME = 'db_name.db'
+DB_FILE_PATH = f'/tmp/{DB_FILENAME}'
+db = SqliteDatabase(DB_FILE_PATH)
+
+class Thing(Model):
+ field1 = CharField(primary_key=True)
+ field2 = CharField()
+
+ class Meta:
+ database = db
+
+The Thing class needs to be able to write into several different databases so I need a clean way of making DB_FILENAME configurable so that the inner class database field will be initialized to the appropriate name.
+The inner Meta class appears to be required and this setup is what is referenced in the documentation (the db declared globally).
+Additional Note: This code is being run in AWS Lambda.
+EDIT: Working solution in fact did not work.
","Write a function that creates the class, and call it with the environment variable as an argument.
+# file2.py
+
+from peewee import CharField, Model, SqliteDatabase
+
+def make_model(db_file_name):
+    # Build the model class with its Meta.database bound to the given file
+    class Thing(Model):
+        field1 = CharField(primary_key=True)
+        field2 = CharField()
+
+        class Meta:
+            database = SqliteDatabase(f'/tmp/{db_file_name}')
+
+    return Thing
+
+
+# script
+
+import os
+
+from file2 import make_model
+
+if __name__ == '__main__':
+    db = make_model(f'db_name_{os.getenv("SUFFIX")}.db')
+
",python
+"Python Selenium Do-While LoopI am trying to complete a do-while loop with the below code. We are waiting for a report to process, and it only shows on the webpage once completed and after clicking the retrieve button. The code,
+# Go to list and click retrieve
+driver.find_element(By.CSS_SELECTOR, "#j_idt40Lbl").click()
+time.sleep(20) # takes a while for their side to run
+retrieve = driver.find_element_by_css_selector("#tab_reportList\:listPgFrm\:listPgRetrBtn")
+retrieve.click() ### Do this action ###
+time.sleep(5)
+retrieve.click()
+time.sleep(5)
+retrieve.click()
+time.sleep(3)
+
+# Click report file
+WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#tab_reportList\:listPgFrm\:listDataTable_0\:0\:reportLink"))).click()
+### Until this is visible ###
+
","If the element that finally becomes visible is not initially present on the page, your code can look something like this:
+driver.find_element(By.CSS_SELECTOR, "#j_idt40Lbl").click()
+time.sleep(20)
+while True:
+ driver.find_element_by_css_selector("#tab_reportList\:listPgFrm\:listPgRetrBtn").click()
+ time.sleep(5)
+ if driver.find_elements_by_css_selector("#tab_reportList\:listPgFrm\:listDataTable_0\:0\:reportLink"):
+ WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#tab_reportList\:listPgFrm\:listDataTable_0\:0\:reportLink"))).click()
+ break
+
+Hardcoded sleeps of 5 and 20 seconds are fragile. If possible, replace them with explicit waits using Expected Conditions.
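The do-while pattern itself can be sketched without Selenium. The helper below is hypothetical (retry_until and its parameters are my own names, not a Selenium API): it performs an action, checks a condition, and repeats until the condition holds or a timeout expires. In your case the action would be the click on the retrieve button and the condition would be whether find_elements returns the report link.

```python
import time


def retry_until(action, is_done, interval=5.0, timeout=120.0,
                sleep=time.sleep, clock=time.monotonic):
    """Run action() at least once, then repeat until is_done() is true.

    Returns True on success and False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        action()
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for the browser: the 'report' appears after the third click.
clicks = []
succeeded = retry_until(
    action=lambda: clicks.append('click'),
    is_done=lambda: len(clicks) >= 3,
    sleep=lambda seconds: None,   # do not actually wait in this demo
)
print(succeeded, len(clicks))
```

The timeout keeps the loop from running forever if the report never shows up, which the plain while True version does not guard against.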
",python
+"Pandas dataframe custom columns and rowsI have been stuck here I don't know why but this is the code:
+import pandas as pd
+import random as R
+n = int(input("How many columns : "))
+list1 = []
+list2=[]
+nilaiRand = R.randint(1,30)
+for i in range(0,n):
+ print("Column-",i+1 ,sep="",end=": ")
+ kolom = input()
+ list1.append(kolom)
+print(list1)
+ind = int(input("Total random number : "))
+inde = range(1,ind+1)
+list2.append(nilaiRand)
+print(list2)
+dataframe = pd.DataFrame(list1,list2)
+
+'list1' is the name of the columns, 'list2' is the random number which is the rows.
+I need to input how many columns you want to make,and it will request the name of each columns and listed in 'list1'. and I want to make the rows as much as ind = int(input("Total random number : ")) and the fill of the line is random number nilaiRand = R.randint(1,30)
+The output should be like
+How many columns : 3
+Column-1 : Eli
+Column-2 :Chick
+Column-3 :You
+Total random number : 4
+
+Result
+
+ Eli Chick You
+1 12 22 3
+2 21 12 11
+3 4 11 21
+4 13 14 5
+
+Any solution?
","This should do the trick:
+import pandas as pd
+import random as R
+n = int(input("How many columns: "))
+dict1 = {}
+for i in range(0,n):
+ column = input()
+ dict1[column] = []
+ind = int(input("Number of rows: "))
+for column in dict1.keys():
+ for _ in range(ind):
+ dict1[column].append(R.randint(1,30))
+dataframe = pd.DataFrame(dict1)
+
+Input:
+
+Output:
+ A B C D E
+0 21 1 6 1 1
+1 21 18 17 3 5
+2 25 3 6 17 23
+3 5 8 16 25 23
+4 6 16 22 30 27
+5 25 17 14 11 13
+6 13 26 30 24 11
+7 4 21 29 14 2
+
+However, please up your question-writing game. Do not upload images of code/errors when asking a question. Also give a clear desired output and tell us at what point in your code you run into problems.
+Update
+If you want to start your index at 1, try:
+dataframe = pd.DataFrame(dict1, index=range(1, ind+1))
+
",python
+"How to convert a nested list of keys to a dummies-like dataframeHow to convert following list to a pandas dataframe?
+my_list = [["A","B","C"],["A","B","D"]]
+
+And as an output I would like to have a dataframe like:
+
+
+
+
+| Index |
+A |
+B |
+C |
+D |
+
+
+
+
+| 1 |
+1 |
+1 |
+1 |
+0 |
+
+
+| 2 |
+1 |
+1 |
+0 |
+1 |
+
+
+
+
","You can craft Series and concatenate them:
+my_list = [["A","B","C"],["A","B","D"]]
+
+df = (pd.concat([pd.Series(1, index=l, name=i+1)
+ for i,l in enumerate(my_list)], axis=1)
+ .T
+ .fillna(0, downcast='infer') # optional
+ )
+
+or with get_dummies:
+df = pd.get_dummies(pd.DataFrame(my_list))
+df = df.groupby(df.columns.str.split('_', n=1).str[-1], axis=1).max()
+
+output:
+ A B C D
+1 1 1 1 0
+2 1 1 0 1
+
",python
+"Discord py Global Mute commandSo, I am trying to create a global mute / unmute command which would be used to mute a specific user in multiple servers. This command is being made only for a community that owns multiple servers so it would be useful to mute someone across all servers where the bot is running.
+I based this mute command on a global ban command which is able to ban a user from multiple servers where the bot is running.
+The first command "gban" in the below code runs beautifully, whenever I ban a user in one server, it bans them from any other servers the bot is running in. I am trying to see how I can use this same setup to create a global mute command. the command "gmute" is what I have created so far. It works, but it only mutes the user in the server the command is ran in and DOES NOT mute the user in all the servers that the bot is running in. I am trying to make it to where "gmute" is a global command and applies to all servers that the bot is running in.
+# global ban command
+
+async def gban(ctx, user: discord.User):
+ for guild in client.guilds:
+ await guild.ban(user)
+ embed = discord.Embed(title="**Global Ban:**", description=f"{user} Has been globally banned from all servers!:no_pedestrians:",colour=discord.Colour.light_gray())
+ await ctx.send(embed = embed)
+
+
+# Global Mute Command
+
+@client.command()
+@commands.has_permissions(manage_messages=True)
+async def gmute(ctx, user: discord.User):
+ for guild in client.guilds:
+ mutedRole = discord.utils.get(guild.roles, name="Coventry")
+
+ if not mutedRole:
+ mutedRole = await ctx.guild.create_role(name="Coventry")
+
+ for guild in client.guilds:
+ await guild.set_permissions(mutedRole, speak=False, send_messages=False, read_message_history=True, read_messages=False)
+ embed = discord.Embed(title="**Global Mute:**", description=f"**{user} has been globablly muted frm all servers! :zipper_mouth:**", colour=discord.Colour.light_gray())
+ await ctx.send(embed=embed)
+ await user.add_roles(mutedRole)
+ await user.send(f"**you have been globally muted!**") ```
+
","You have to call add_roles inside the loop.
+Your indentation is wrong: the if not mutedRole: part should be inside the loop too.
+You are creating the new mute role in ctx.guild (the guild the command was run in) and not in guild (the guild you got from the loop).
+Also, if you create a new mute role, you'd be setting its permissions only in the guild the role is in, not in all the servers. Instead of doing 2 API calls (create_role and set_permissions), you can just use the permissions keyword argument in create_role. So instead of
+mutedRole = await ctx.guild.create_role(name="Coventry")
+
+for guild in client.guilds:
+ await guild.set_permissions(mutedRole, speak=False, send_messages=False, read_message_history=True, read_messages=False)
+
+You'd do
+mutedRole = await guild.create_role(name="Coventry", permissions=discord.Permissions(speak=False, send_messages=False, read_message_history=True, read_messages=False))
+
+Your final code will be
+@client.command()
+@commands.has_permissions(manage_messages=True)
+async def gmute(ctx, user: discord.User):
+    for guild in client.guilds:
+        mutedRole = discord.utils.get(guild.roles, name="Coventry")
+
+        if not mutedRole:
+            mutedRole = await guild.create_role(name="Coventry", permissions=discord.Permissions(speak=False, send_messages=False, read_message_history=True, read_messages=False))
+
+        # get_member returns None when the user is not a member of this guild
+        member = guild.get_member(user.id)
+        if member is not None:
+            await member.add_roles(mutedRole)
+
+    embed = discord.Embed(title="**Global Mute:**", description=f"**{user} has been globally muted from all servers! :zipper_mouth:**", colour=discord.Colour.light_gray())
+    await ctx.send(embed=embed)
+    await user.send(f"**you have been globally muted!**")
+
",python
+"find pairs of numbers where cube is equal to squareWe are given a number N and we have to find pairs i and j where i^3=j^2
+For example, let N=50 so for this we will have 3 pairs (1,1),(4,8),(9,27)
+basically, we have to find pairs where the cube of one number is the same as the square of the other number in a given pair
+the constraint is
+
+Naive approach: use 2 for loops, iterate through each element and get those pairs where the cube is equal to the square. The time complexity is O(n^2).
+def get_pair(N):
+ for i in range(1,N):
+ for j in range(1,N):
+ if i*i*i==j*j:
+ print(i,j)
+N=50
+get_pair(N)
+
+what will be an optimal way to solve this problem with a better time complexity?
","Since you're working with integers, if there exists some number M = i^3 = j^2 for i and j between 1 and N, then that means there exists a k such that M = k^6. To find i and j, simply compare the representations of M:
+(1) M = k^6 = i^3 = (k^2)^3 therefore i = k^2
+(2) M = k^6 = j^2 = (k^3)^2 therefore j = k^3
+Since j is always greater than or equal to i, you only need to check if 1 < k^3 < N. In other words, k should be less than the cube root of N.
+
+
+
+
+| k |
+M = k^6 |
+i = k^2 |
+j = k^3 |
+
+
+
+
+| 2 |
+64 |
+4 |
+8 |
+
+
+| 3 |
+729 |
+9 |
+27 |
+
+
+| 4 |
+4,096 |
+16 |
+64 |
+
+
+| 5 |
+15,625 |
+25 |
+125 |
+
+
+| 6 |
+46,656 |
+36 |
+216 |
+
+
+| ... |
+... |
+... |
+... |
+
+
+| 97 |
+8.329x10^11 |
+9409 |
+912,673 |
+
+
+| 98 |
+8.858x10^11 |
+9604 |
+941,192 |
+
+
+| 99 |
+9.415x10^11 |
+9801 |
+970,299 |
+
+
+
+
+Note that 100 isn't a valid candidate for k because that would make j less than or equal to N instead of strictly less than N (if we're going with N = 10^6).
+So to get the list of tuples that satisfy your problem, find the values of k such that 1 < k^3 < N and return its square and cube in a tuple.
+import math
+from typing import List, Tuple
+
+N: int = 10**6
+pairs: List[Tuple[int, int]] = [(k * k, k * k * k) for k in range(2, math.ceil(N**(1 / 3)))]
+print(pairs)
+
+This is a list comprehension, a shorthand for a for-loop.
+I'm basically asking Python to generate a list of tuples over an index k that falls in the range defined as range(2, math.ceil(N**(1 / 3))). That range is exactly the first column of the table above.
+Then, for every k in that range I make a tuple of which the first item is k^2 and the second item is k^3, just like the last two columns of the table.
+Also threw in the typing library in there for good measure. Clear code is good code, even for something as small as this. Python can figure out that pairs is a list of tuples without anyone telling it, but this way I've explicitly enforced that variable to be a list of tuples to avoid any confusion when someone tries to give it a different value or isn't sure what the variable contains.
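If you would rather avoid the floating-point cube root, and also want the trivial pair (1, 1) that the question lists for N=50, here is an integer-only sketch. It assumes the bounds 1 <= i, j < N from the question's range(1, N); the function name is made up.

```python
from typing import List, Tuple


def cube_square_pairs(n: int) -> List[Tuple[int, int]]:
    """Return all (i, j) with i**3 == j**2 and 1 <= i, j < n."""
    pairs = []
    k = 1
    # j = k**3 is the larger of the two numbers, so it decides when to stop
    while k ** 3 < n:
        pairs.append((k * k, k ** 3))
        k += 1
    return pairs


print(cube_square_pairs(50))
```

The loop runs about N^(1/3) times, so the complexity is the same as the list comprehension, just without any rounding concerns.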
",python
+"Can't update plotly go.Table column order and cell color using dropdown buttonsI have two pandas dataframes, scores containing a set of scores I would like to display in a table, and colours mapping a set of colours I would like the cells in the table to be. Both share the same column headers and index.
+I am trying to generate a dropdown menu for a table created in plotly using go.Table. Without dropdown options I am able to set up a table with cells coloured according to their value in the following way:
+ table = go.Figure(data=[go.Table(
+ header=dict(values= scores_cols,
+ fill_color='paleturquoise',
+ align='center'),
+ cells=dict(values= [scores[x] for x in scores_cols],
+ fill_color=[colours[x] for x in scores_cols],
+ align='center'))
+ ])
+
+In order to add dropdown buttons to sort columns and their corresponding cell colours I am using the following:
+buttons = []
+
+for score in scores_cols:
+    scores = scores.sort_values(by=[score])
+    colours = colours.reindex(scores.index)
+    buttons.append(dict(
+        label=score,
+        method='restyle',
+        args=[
+            {"cells":
+                {"values": [scores[x] for x in scores_cols],
+                 "fill_color": [colours[x] for x in scores_cols]},
+            }]))
+table.update_layout(
+    updatemenus=[
+        dict(
+            buttons=buttons,
+            direction="down",
+            pad={"r": 10, "t": 10},
+            showactive=True,
+            x=0.01,
+            y=1.3,
+            xanchor="left",
+            yanchor="top")
+    ])
+
+The colour mapping works for the initial table. However, when a dropdown button is clicked, the table is updated to reflect the new row ordering according to the selected column, but no colour mapping is present at all. Any advice with this would be great, as I am running out of ideas!
","Found the answer to this. When updating the fill colour, pass a fill dict with color as one of its keys, rather than trying to assign fill_color directly:
+buttons.append(dict(
+    label=score,
+    method='restyle',
+    args=[
+        {"cells": {
+            "values": [sorted_scores[x] for x in displayed_scores_cols],
+            "fill": dict(color=[sorted_colours[x] for x in displayed_scores_cols])}}]))
+
",python
+"Concatenate Python tuples in numba
+I'm looking to fill an array of zeros with numbers taken from some tuples, easy as that.
+Normally this is not a problem even when the tuples are not the same length (which is the point here), but it won't compile under numba and I cannot figure out a solution.
+from numba import jit
+import numpy as np
+
+def cant_jit(ls):
+
+    # Array total length
+    tl = 6
+    # Type
+    typ = np.int64
+
+    # Array to modify and return
+    start = np.zeros((len(ls), tl), dtype=typ)
+
+    for i in range(len(ls)):
+
+        a = np.array((ls[i]), dtype=typ)
+        z = np.zeros((tl - len(ls[i]),), dtype=typ)
+        c = np.concatenate((a, z))
+        start[i] = c
+
+    return start
+
+# Uneven tuples would be no problem in vanilla
+cant_jit(((2, 4), (6, 8, 4)))
+
+
+jt = jit(cant_jit)
+# working fine
+jt(((2, 4), (6, 8)))
+# non working
+jt(((2, 4), (6, 8, 4)))
+
+This fails with the error:
+getitem(Tuple(UniTuple(int64 x 3), UniTuple(int64 x 2)), int64)
+There are 22 candidate implementations:
+- Of which 22 did not match due to:
+Overload of function 'getitem': File: : Line N/A.
+With argument(s): '(Tuple(UniTuple(int64 x 3), UniTuple(int64 x 2)), int64)':
+No match.
+I tried some things here without success. Does someone know a way around this so the function can be compiled and still do its thing?
","This isn't possible as far as I can tell. The numba documentation tells us that nested tuples that aren't of equal length aren't legal unless you use forceobj=True. You can't even unpack *args, which is frustrating; without that argument you will always receive that error.
+Just add the argument to jit() like this:
+
+from numba import jit
+import numpy as np
+
+def cant_jit(ls):
+
+    # Array total length
+    tl = 6
+    # Type
+    typ = np.int64
+
+    # Array to modify and return
+    start = np.zeros((len(ls), tl), dtype=typ)
+
+    for i in range(len(ls)):
+
+        a = np.array((ls[i]), dtype=typ)
+        z = np.zeros((tl - len(ls[i]),), dtype=typ)
+        c = np.concatenate((a, z))
+        start[i] = c
+
+    return start
+
+# Uneven tuples would be no problem in vanilla
+cant_jit(((2, 4), (6, 8, 4)))
+
+
+jt = jit(cant_jit, forceobj=True)
+# working fine
+jt(((2, 4), (6, 8)))
+# now working
+jt(((2, 4), (6, 8, 4)))
+
+This works, but it's kind of pointless, and you may as well use plain Python.
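+For the plain-Python route, a minimal sketch of the same zero padding without numba could look like this; pad_rows and its parameters are hypothetical names chosen for the illustration:

```python
def pad_rows(rows, total_len=6, fill=0):
    """Right-pad every tuple in rows with fill up to total_len items."""
    padded = []
    for row in rows:
        if len(row) > total_len:
            raise ValueError("row is longer than total_len")
        # Copy the row and append as many fill values as are missing
        padded.append(list(row) + [fill] * (total_len - len(row)))
    return padded


print(pad_rows(((2, 4), (6, 8, 4))))
# [[2, 4, 0, 0, 0, 0], [6, 8, 4, 0, 0, 0]]
```

+Passing the result to np.array(..., dtype=np.int64) should give an array of the same shape as the one returned by cant_jit.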
",python
+"Replace each line containing a specific word from the first file with the line from the second fileI have two txt files, the first one contains duplicated word "PACKAGES", i want to replace each "PACKAGES" word with a line from file 2
+Ex of file 1:
+NEW FUSTAT TOURS
+City
+USA
+Address
+napolean
+PACKAGES
+Test TOURS
+City
+UK
+Address
+napolean
+PACKAGES
+
+Ex of file 2:
+First Company
+Second Company
+
+Expected Output:
+NEW FUSTAT TOURS
+City
+USA
+Address
+napolean
+First Company
+Test TOURS
+City
+UK
+Address
+napolean
+Second Company
+
+I tried:
+with open("file1.txt", encoding="utf-8") as first, open("file2.txt", encoding="utf-8") as second:
+    first_file = first.read()
+    second_file = second.readline()
+    print(first_file.replace('PACKAGES', second_file))
+
+Result:
+NEW FUSTAT TOURS
+City
+USA
+Addres
+napolean
+Phone Number
+************
+Email
+***************
+First Company
+Test TOURS
+City
+UK
+Addres
+napolean
+Phone Number
+************
+Email
+***************
+First Company
+
+Any kind of help please?
","Here's one way you could do it:
+with open('file1.txt', encoding='utf-8') as f1:
+    with open('file2.txt', encoding='utf-8') as f2:
+        f2lines = iter(f2.readlines())
+        for f1line in f1:
+            if f1line.startswith('PACKAGES'):
+                print(next(f2lines), end='')
+            else:
+                print(f1line, end='')
+
+Output:
+NEW FUSTAT TOURS
+City
+USA
+Address
+napolean
+First Company
+Test TOURS
+City
+UK
+Address
+napolean
+Second Company
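+One caveat: next(f2lines) raises StopIteration if file 1 has more PACKAGES lines than file 2 has lines. A hedged sketch of the same idea on in-memory lists, keeping the marker when the replacements run out (fill_placeholders and the sample data are made up for this illustration):

```python
def fill_placeholders(lines, replacements, marker="PACKAGES"):
    """Swap each marker line for the next replacement, in order."""
    pending = iter(replacements)
    result = []
    for line in lines:
        if line.strip() == marker:
            # next() with a default keeps the marker if nothing is left
            result.append(next(pending, line))
        else:
            result.append(line)
    return result


template = ["NEW FUSTAT TOURS", "PACKAGES", "Test TOURS", "PACKAGES", "PACKAGES"]
companies = ["First Company", "Second Company"]
print(fill_placeholders(template, companies))
# ['NEW FUSTAT TOURS', 'First Company', 'Test TOURS', 'Second Company', 'PACKAGES']
```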
+
",python
+"How to apply recursion over this problem and solve this problemThe Problem is:-
+Given a digit string, return all possible letter combinations of each digits according to the buttons on a telephone, that the number could represent.
+The returned strings must be lexicographically sorted.
+
+Example-1 :-
+Input : “23”
+Output : ["ad", "ae", "af", "bd", "be", "bf", "cd", "ce", "cf"]
+
+Example-2 :-
+Input : “9”
+Output: [“w”, “x”, “y”, “z”]
+
+Example-3 :-
+Input : “246”
+Output : ["agm", "agn", "ago", "ahm", ..., "cho", "cim", "cin" "cio"] {27 elements}
+
+I've squeezed my brain on this, and I've tried a lot but I'm not getting ahead of this part, what I've tried is to use a recursive function that zips the individual letters of each digit with each other letters and use itertools.combinations() over it, but I'm unable to complete this function and I'm unable to get ahead of this.
+What I've tried is:
+times, str_res = 0, ""
+
+def getval(lst, times):
+    if times==len(lst)-1:
+        for i in lst[times]:
+            yield i
+    else:
+        for i in lst[times]:
+            yield i + getval(lst, times+1)
+
+dct = {"2":("a","b","c"), "3":("d","e","f"), "4":("g","h","i"),
+       "5":("j","k","l"), "6":("m","n","o"), "7":("p","q","r","s"),
+       "8":("t","u","v"), "9":("w","x","y","z"), "1":("")}
+
+str1, res = "23", []
+
+if len(str1)==1:
+    print(dct[str1[0]])
+else:
+    temp = [dct[i] for i in str1]
+    str_res = getval(temp, times)
+    print(str_res)
+
+Please suggest me your ideas over this problem or in completing the function...
","It's not itertools.combinations that you need, it's itertools.product.
+from itertools import product
+
+def all_letter_comb(s, dct):
+    for p in product(*map(dct.get, s)):
+        yield ''.join(p)
+
+dct = {"2":("a","b","c"), "3":("d","e","f"), "4":("g","h","i"),
+       "5":("j","k","l"), "6":("m","n","o"), "7":("p","q","r","s"),
+       "8":("t","u","v"), "9":("w","x","y","z"), "1":("")}
+
+for s in ['23', '9', '246']:
+    print(s)
+    print(list(all_letter_comb(s, dct)))
+    print()
+
+Output:
+23
+['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
+
+9
+['w', 'x', 'y', 'z']
+
+246
+['agm', 'agn', 'ago', 'ahm', 'ahn', 'aho', 'aim', 'ain', 'aio', 'bgm', 'bgn', 'bgo', 'bhm', 'bhn', 'bho', 'bim', 'bin', 'bio', 'cgm', 'cgn', 'cgo', 'chm', 'chn', 'cho', 'cim', 'cin', 'cio']
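+Since the question asked about recursion specifically, here is a minimal recursive sketch that should produce the same lists; KEYPAD and letter_combinations are names chosen for this illustration, and digits without letters are simply left out of the table:

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def letter_combinations(digits):
    """Return every letter string the digit string could spell."""
    # Base case: there is exactly one way to spell nothing
    if not digits:
        return [""]
    # Solve the rest first, then prefix each letter of the first digit
    tails = letter_combinations(digits[1:])
    return [letter + tail for letter in KEYPAD[digits[0]] for tail in tails]


print(letter_combinations("23"))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
```

+Because each digit's letters are stored in alphabetical order, the result comes out lexicographically sorted without an extra sort. Note that an empty input yields [''] here; a caller who wants [] for that case would need to special-case it.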
+
",python
+"In Python, How do you use a loop to create a dataframe?I recently pulled data from youtube API, and I'm trying to create a data frame using that information.
+When I loop through each item with the "print" function, I get 25 rows output for each variable (which is what I want in the data frame I create).
+How can I create a new data frame that contains 25 rows using this information instead of just 1 line in the data frame?
+When I loop through each item like this:
+df = pd.DataFrame(columns = ['video_title', 'video_id', 'date_created'])
+#For Loop to Create columns for DataFrame
+
+x=0
+
+while x < len(response['items']):
+    video_title= response['items'][x]['snippet']['title']
+    video_id= response['items'][x]['id']['videoId']
+    date_created= response['items'][x]['snippet']['publishedAt']
+    x=x+1
+
+#print(video_title, video_id)
+df = df.append({'video_title': video_title,'video_id': video_id,
+                'date_created': date_created}, ignore_index=True)
+
+=========ANSWER UPDATE==========
+THANK YOU TO EVERYONE THAT GAVE INPUT !!!
+The code that created the DataFrame was:
+import pandas as pd
+
+x=0
+video_title = []
+video_id = []
+date_created = []
+
+while x < len(response['items']):
+    video_title.append(response['items'][x]['snippet']['title'])
+    video_id.append(response['items'][x]['id']['videoId'])
+    date_created.append(response['items'][x]['snippet']['publishedAt'])
+    x=x+1
+
+#print(video_title, video_id)
+df = pd.DataFrame({'video_title': video_title, 'video_id': video_id,
+                   'date_created': date_created})
+
","Based on what I know about youtube APIs return objects, the values of 'title' , 'videoId' and 'publishedAt' are strings.
+One strategy for making a single df from these strings is:
+
+- Store the strings in lists, one list per field, so you will have three lists.
+- Convert the lists into a df.
+
+You will get a df with x rows, where x is the number of items retrieved.
+Example:
+import pandas as pd
+
+x=0
+video_title = []
+video_id = []
+date_created = []
+
+while x < len(response['items']):
+    video_title.append(response['items'][x]['snippet']['title'])
+    video_id.append(response['items'][x]['id']['videoId'])
+    date_created.append(response['items'][x]['snippet']['publishedAt'])
+    x=x+1
+
+#print(video_title, video_id)
+df = pd.DataFrame({'video_title': video_title, 'video_id': video_id,
+                   'date_created': date_created})
+
",python
+"Explanation about Garbage collection pythonHi im not sure if this is related to garbage collection in python but im looking for some guidance in how it works under the hood.
+Below is a part of my program.
+def get_data():
+
+    templist = []
+
+    '''
+    does stuff to fill templist with newest data
+    '''
+
+    return templist
+
+def save_data(new_list, old_list):
+
+    '''
+    loops to check for updates.
+    if update, write to file
+    '''
+    if not old_list:
+        for n in new_list:
+            write_file(n)
+    else:
+        for n, o in zip(new_list,old_list):
+            if n[1] != o[1]:
+                write_file(n)
+    return new_list
+
+comparelist = []
+
+while True:
+
+    newlist = get_data()
+    comparelist = (save_data(newlist, comparelist))
+
+I have checked with the id() function: get_data() returns a new object each time, which is bound to newlist and then passed on to comparelist (still a reference to the same memory as templist).
+My questions are:
+
+- On each lap, are the lists that are no longer used destroyed?
+- Is there a better way to do this?
+
","How and when objects are collected is generally a non issue, you should only care about how long you keep strong references to them around, which is mostly solved by scoping and knowing which objects you are keeping alive yourself.
+When an object's reference count drops to zero, it becomes eligible for deletion; when that deletion actually occurs is implementation dependent. In the reference CPython implementation, it happens immediately after the count reaches 0: https://docs.python.org/3/c-api/refcounting.html#c.Py_DECREF
+To save a bit of time, CPython keeps a few deallocated list objects around and reuses them later instead of creating completely new objects every time, so there's really nothing to worry about.
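+The immediate deallocation can be observed with a weak reference. This is a minimal sketch that assumes CPython; Batch is a throwaway list subclass introduced only because plain lists cannot be weakly referenced:

```python
import weakref


class Batch(list):
    """Trivial list subclass so that weakref.ref() can point at it."""


def get_data():
    return Batch([1, 2, 3])


comparelist = get_data()
old_ref = weakref.ref(comparelist)  # does not keep the object alive

# Rebinding the name drops the last strong reference to the first Batch
comparelist = get_data()
print(old_ref() is None)  # True on CPython: the old object is already gone
```

+On other implementations the old object may linger until their collector runs, which is why relying on the exact timing is discouraged.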
",python
+"Data Manipulation in multiple columns(absolute, percentage, and categorical) in pandas dataframeI need to make a function, which takes input as dataframe, and dictionary{"Col_1" :% change,"Col_2":absolute change,"Col_3": 0/1(Categorical)} and it should make the changes to the dataframe.
+I have a data frame like this:
+
+| Date | col_1 | col_2 | col_3 |
+|------|-------|-------|-------|
+| 01/01/2022 | 90 | 100 | 0 |
+| 01/02/2022 | 80 | 110 | 1 |
+| 01/03/2022 | 92 | 120 | 0 |
+| 01/04/2022 | 96 | 130 | 0 |
+| 01/05/2022 | 99 | 150 | 1 |
+| 01/06/2022 | 105 | 155 | 1 |
+
+Now I pass the dictionary say,
+{"Date":["01/01/2022","01/02/2022"],"col_1":[-10,-10],"col_2":10,"col_3":[1,0]}
+
+
+- for "col_1" I am passing -10, -10: a percentage change to apply to its previous values on the specified dates.
+- for "col_2" I am passing an absolute number, 10 (it should replace the previous values with 10) on the specified dates.
+- for "col_3" I am passing a binary number, which is written to the dataframe on the specified dates.
+
+Then my desired output would look like this:
+
+| Date | col_1 | col_2 | col_3 |
+|------|-------|-------|-------|
+| 01/01/2022 | 81 | 10 | 1 |
+| 01/02/2022 | 72 | 10 | 0 |
+| 01/03/2022 | 92 | 120 | 0 |
+| 01/04/2022 | 96 | 120 | 0 |
+| 01/05/2022 | 99 | 150 | 1 |
+| 01/06/2022 | 105 | 155 | 1 |
+
+I tried this code:
+def per_change(df,cols,d):
+    df[cols] = df[cols].add(df[cols].div(100).mul(pd.Series(d)), fill_value=0)
+    return df
+
+but it didn't work out. Please help!
","You could build a boolean mask from dic["Date"] and update the matching rows of df using the values under the other keys in dic:
+msk = df['Date'].isin(dic['Date'])
+df.loc[msk, 'col_1'] *= (1 + np.array(dic['col_1']) / 100)
+df.loc[msk, 'col_2'] = dic['col_2']
+df.loc[msk, 'col_3'] = dic['col_3']
+
+Output:
+ Date col_1 col_2 col_3
+0 01/01/2022 81.0 10 1
+1 01/02/2022 72.0 10 0
+2 01/03/2022 92.0 120 0
+3 01/04/2022 96.0 130 0
+4 01/05/2022 99.0 150 1
+5 01/06/2022 105.0 155 1
+
",python
+"How can I convert all values in a column like '€226.5M' or '€100.1K' (type object) to 226.5 or 0.1001 (type float) while working with Pandas?
+I have this DataFrame and I know I should use the replace method, but I don't know in which way.
+I want all values in the column to be floats in million euros, so I would erase the '€', also the 'M' and if a value has a 'K' instead of an 'M', erase the K and make the number 1000 times smaller.
+Thanks!
+https://i.stack.imgur.com/lqbvU.png
","Create a custom function to convert string values to numeric:
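+# Multipliers that express every value in millions of euros
+mappings = {'M': 1, 'K': 0.001}
+
+def to_numeric(sr):
+    # Split each string into its numeric part and its K/M suffix
+    df = sr.str.extract('([^€KM]+)([KM]?)')
+    return df[0].astype(float) * df[1].map(mappings).astype(float)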
+
+# Convert your columns
+df['Value'] = to_numeric(df['Value'])
+df['Wage'] = to_numeric(df['Wage'])
+df['Release Clause'] = to_numeric(df['Release Clause'])
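+The same conversion can be sketched for a single string with the standard library, which makes the edge cases easier to see. euros_to_millions is a hypothetical helper, and treating a value without a suffix as plain euros is an assumption of this sketch (the question only mentions M and K):

```python
import math
import re

# Multiplier that turns each suffix into millions of euros.
# Assumption: a value without a suffix is a plain euro amount.
_SUFFIX_TO_MILLIONS = {"M": 1.0, "K": 1e-3, "": 1e-6}
_PATTERN = re.compile(r"€?\s*([0-9]*\.?[0-9]+)\s*([KM]?)")


def euros_to_millions(text):
    """Convert strings such as '€226.5M' or '€100.1K' to millions of euros."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        return math.nan  # unparseable input becomes NaN rather than an error
    number, suffix = match.groups()
    return float(number) * _SUFFIX_TO_MILLIONS[suffix]


print(euros_to_millions("€226.5M"))
```

+Such a function could then be applied to a column with Series.map, for example df['Value'].map(euros_to_millions).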
+
",python
+"How can I clean html code data in a DataFrame?
+I called an API and put the result in a DataFrame.
+One column has many rows whose values are HTML code. I would like to strip the HTML markup and keep only the text itself. How can I do that?
+Example:
+<p><span style="color: red;">家庭旅遊保險計劃</span><span style="font-size: 11.5pt;color: red;">包</span><span style="font-size: 11.5pt;color: red;">括</span><span style="color: red;">申請人及配偶,及<b>免費</b>最多其四名</span><span style="color: red;">18</span><span style="color: red;">歲以下之同行子女</span><span style="color: red;"></span></p><p>每人每次最高賠償額* (港元)</p><p><b>金計劃</b><span style="font-size: 11.5pt;color: black;"><br/> </span>醫療費用 $1,000,000<span style="font-size: 11.5pt;color: black;"><br/> </span>個人意外 $1,000,000<span style="font-size: 11.5pt;color: black;"><br/> </span>手機意外損毀或遺失 $3,000</p><p><b>銅計劃</b><span style="font-size: 11.5pt;color: black;"><br/> </span>醫療費用 $250,000<span style="font-size: 11.5pt;color: black;"><br/> </span>個人意外 $250,000</p><p>*18歲以下及75歲以上之受保人於計劃內的保障額將會減少<br/></p>
+
","Firstly, run this in a terminal:
+pip install beautifulsoup4
+Afterwards, apply a proper function to the html column of your pandas dataframe (See below).
+Code:
+from bs4 import BeautifulSoup
+import pandas as pd
+
+# Create a sample dataframe
+html = '<p><span style="color: red;">家庭旅遊保險計劃</span><span style="font-size: 11.5pt;color: red;">包</span><span style="font-size: 11.5pt;color: red;">括</span><span style="color: red;">申請人及配偶,及<b>免費</b>最多其四名</span><span style="color: red;">18</span><span style="color: red;">歲以下之同行子女</span><span style="color: red;"></span></p><p>每人每次最高賠償額* (港元)</p><p><b>金計劃</b><span style="font-size: 11.5pt;color: black;"><br/> </span>醫療費用 $1,000,000<span style="font-size: 11.5pt;color: black;"><br/> </span>個人意外 $1,000,000<span style="font-size: 11.5pt;color: black;"><br/> </span>手機意外損毀或遺失 $3,000</p><p><b>銅計劃</b><span style="font-size: 11.5pt;color: black;"><br/> </span>醫療費用 $250,000<span style="font-size: 11.5pt;color: black;"><br/> </span>個人意外 $250,000</p><p>*18歲以下及75歲以上之受保人於計劃內的保障額將會減少<br/></p>'
+df = pd.DataFrame([{'html': html}])
+
+# Extract text from html
+df['extracted'] = df.html.apply(lambda s: BeautifulSoup(s, 'html.parser').text)
+
+Output:
+
+| | html | extracted |
+|---|---|---|
+| 0 | (the HTML string assigned to html above) | 家庭旅遊保險計劃包括申請人及配偶,及免費最多其四名18歲以下之同行子女每人每次最高賠償額* (港元)金計劃 醫療費用 $1,000,000 個人意外 $1,000,000 手機意外損毀或遺失 $3,000銅計劃 醫療費用 $250,000 個人意外 $250,000*18歲以下及75歲以上之受保人於計劃內的保障額將會減少 |
+
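+If installing a third-party package is not an option, a rough equivalent can be sketched with the standard library's html.parser. TextExtractor and strip_tags are names invented for this illustration, and the sample markup is made up:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text between tags
        self.parts.append(data)


def strip_tags(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


print(strip_tags('<p><b>Gold plan</b><br/>Medical $1,000,000</p>'))
# Gold planMedical $1,000,000
```

+It could be applied to the column in the same way, e.g. df.html.apply(strip_tags), though BeautifulSoup is likely more forgiving with malformed HTML.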
",python
+"Why does Matplotlib shows an incorrect image?I am trying to divide an image into patches and visualize it but matplotlib keep showing totally incorrect output.
+from PIL import Image
+import os
+import numpy as np
+import matplotlib.pyplot as plt
+def imgcrop(input, xPieces, yPieces):
+    filename, file_extension = os.path.splitext(input)
+    im = Image.open(input)
+    imgwidth, imgheight = im.size
+    height = imgheight // yPieces
+    width = imgwidth // xPieces
+    for i in range(0, yPieces):
+        for j in range(0, xPieces):
+            box = (j * width, i * height, (j + 1) * width, (i + 1) * height)
+            a = im.crop(box)
+            np_img = np.asarray(a)
+            plt.imshow(np_img)
+
+I used the method as follows:
+imgcrop("cats.jpeg", 14, 14)
+
+I got a 16 x 16 patches but in different colours entirely different from the image
+code credit: #How to Split Image Into Multiple Pieces in Python
+Input:
+![]()
+Output:
+![]()
","Your issue is not that the color is wrong, but that you are only seeing the very last patch of your image being displayed (at least when run in jupyter notebook)
+This results in the only patch visible being one of the ground (lower right corner), which is completely in shades of brown and does therefore look very different to your initial picture.
+The easiest fix is to use plt.subplots to plot all patches:
+from PIL import Image
+import os
+import numpy as np
+import matplotlib.pyplot as plt
+def imgcrop(input, xPieces, yPieces):
+    filename, file_extension = os.path.splitext(input)
+    im = Image.open(input)
+    imgwidth, imgheight = im.size
+    height = imgheight // yPieces
+    width = imgwidth // xPieces
+    # One subplot per patch, laid out like the original image
+    fig, axs = plt.subplots(yPieces, xPieces)
+    for i in range(0, yPieces):
+        for j in range(0, xPieces):
+            box = (j * width, i * height, (j + 1) * width, (i + 1) * height)
+            a = im.crop(box)
+            np_img = np.asarray(a)
+            axs[i][j].imshow(np_img)
+    # Hide the axes so only the patches are visible
+    for axi in axs.ravel():
+        axi.set_axis_off()
+imgcrop("cat.jpg", 14, 14)
+
+Input:
+![]()
+Output:
+![]()
",python
+"How to destroy tkinter mainloop when exception occurs in any module running in daemon threadI wrote a Python (3.7) application that uses tkinter as GUI and runs the main function in a daemon thread. Is there a way to destroy the tkinter mainloop when an exception occurs in that daemon thread?
+The issue is that the program has multiple modules. Is there a way to kill the mainloop if that happens in any of them? Otherwise the user will be left with a frozen GUI.
+Here is the piece of code starting the thread:
+
+import logging
+import threading
+from tkinter import *
+from tkinter import ttk
+
+
+def main_thread():
+
+    if check_input():  # user provided necessary input
+        program_thread = threading.Thread(target=program_pipeline)  # runs main
+        program_thread.daemon = True  # daemon thread can be killed anytime?
+        program_thread.start()
+        block_user_entries()
+        clear_results()  # from a previous run
+
+    else:
+        logging.info("\n--------------- TRY AGAIN -------------------\n")
+        ublock_user_entries()
+
+
+
+The program_pipeline communicates with multiple modules and packages
+The thread starts when user clicks a button
+
+analyze_button = ttk.Button(frame, text="Analyze", state="normal", command=main_thread)
+analyze_button.grid(column=2, row=0, pady=2, sticky=(W))
+
+root.mainloop()
+
","If I am interpreting this correctly, you are looking for a way to interrupt the main thread from a daemon thread.
+Generally this is not a recommended option, but you can do it using the low-level _thread.interrupt_main().
+If you provide more information, a better solution can be thought of.
+import _thread
+import threading
+import tkinter as tk
+
+
+def program_pipeline():
+    try:
+        # DO STUFF
+        # calling other modules that may raise exception/error
+        raise ValueError("Error")
+    except BaseException as be:
+        print("Exception in daemon", be)
+        _thread.interrupt_main()
+
+
+def main_thread():
+
+    program_thread = threading.Thread(target=program_pipeline)
+    program_thread.daemon = True
+    program_thread.start()
+
+
+root = tk.Tk()
+analyze_button = tk.Button(root, text="Analyze", state="normal", command=main_thread)
+analyze_button.grid(column=2, row=0, pady=2, sticky=tk.W)
+
+try:
+    root.mainloop()
+except KeyboardInterrupt as kie:
+    print("exception in main", kie)
+
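+A commonly suggested alternative to interrupting the main thread is to hand the exception over through a queue and let the GUI poll for it. The sketch below shows only the thread and queue half, without any GUI; guarded and poll_errors are hypothetical helpers, and in a tkinter app the polling would presumably be scheduled with root.after(...) and call root.destroy() on failure:

```python
import queue
import threading

errors = queue.Queue()  # thread-safe channel from worker to main thread


def program_pipeline():
    # Stand-in for the real work, which may fail in any module
    raise ValueError("Error")


def guarded(target):
    """Run target and report any exception instead of losing it."""
    try:
        target()
    except Exception as exc:
        errors.put(exc)


def poll_errors():
    """Return the next reported exception, or None if there is none."""
    try:
        return errors.get_nowait()
    except queue.Empty:
        return None


worker = threading.Thread(target=guarded, args=(program_pipeline,), daemon=True)
worker.start()
worker.join()  # only for this demo; a GUI would poll instead of joining

failure = poll_errors()
print(repr(failure))
```

+This keeps all GUI calls on the main thread, at the cost of a small polling delay.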
",python
+"Looping through each cell of a column in Excel using PythonI have an Excel file with data in column A. I want to loop through each cell and stop once i reach the first cell that has a formula.
+This is my code:
+wb = openpyxl.load_workbook(r"path\filename.xlsx")
+
+sheet = wb['Sheet1']
+
+names=sheet['A']
+
+for cellObj in names:
+
+    if cellObj.value.str[0] == '=':
+        print(cellObj.value)
+
+But i get an error...
+The response from @John Giorgio worked for that part.
+Now what i'm trying to do next is to select the entire row and paste special.
+Here's my code:
+import openpyxl
+wb=openpyxl.load_workbook(path1)
+sheet = wb['Sheet1']
+
+names=sheet['A']
+
+for cellObj in names:
+    val = str(cellObj.value)
+    if val[0] == "=":
+        #print(val)
+
+        excel.Range("A10:TM10").Select() #PART I WANT TO SELECT IF TRUE
+        excel.Selection.PasteSpecial(Paste=-4163)
+
+So once I come across the first cell with '=', I want to select the entire row, or at least columns A to TM, and hardcode it / paste special.
","wb = openpyxl.load_workbook(r"path\filename.xlsx")
+
+sheet = wb['Sheet1']
+
+names=sheet['A']
+
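+for cellObj in names:
+    # str() avoids an error on empty (None) or numeric cells
+    val = str(cellObj.value)
+    if val[0] == "=":
+        print(val)
+        break  # stop at the first cell that holds a formula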
+
",python
+"How to turn an *.RDS file into a *.FEATHER file?
+I am trying to convert an *.rds file in R into a *.feather file for use in Python.
+library(feather)
+data = readRDS("file.rds")
+write_feather(data,"file.feather")
+
+However, I receive the following error:
+> write_feather(data,"file.feather")
+Error: `x` must be a data frame
+
+How can I turn the *.rds file/matrix into a *.feather file to read with Pandas (or any other Pandas-compatible file that can handle a 24000*24000 matrix)?
","Coerce the matrix object to a data.frame object and assign the result back before writing:
+library(feather)
+data = readRDS("file.rds")
+data = as.data.frame(as.matrix(data))
+write_feather(data,"file.feather")
+
",python
+"Slash commands on nextcordI wanted to add slash commands but unsuccessfully. After hours of documentation reading, examples checking I finally left the idea to try it on my current code. So I've litterally try the example code from the nextcord documentation. I copy/paste the code add the token and the guild ID. But that wasn't more successful.
+My bot has admin permission (8 in the scope), all intents are checked on the bot dev panel, and after hours of waiting... nothing on the slash list. I thought it was a refresh problem in my Discord client, so I tested Discord on several devices (PC, Mac, phone, ...) but nope.
+As I said, I tried the example code with only the server ID changed (and the token correctly edited; the bot itself is running well).
+import nextcord
+
+client = nextcord.Client()
+server = numberfromguildid
+
+
+@client.slash_command(guild_ids=[server]) # limits guilds with this command.
+async def ping(
+    interaction: nextcord.Interaction,
+):
+    await interaction.response.send_message("Pong!")
+
+
+client.run("TOKEN")
+
+If anyone has a solution, that would be a life saver!
","Did you enable the applications.commands scope while generating the OAuth2 link? You also still need to import the modules from nextcord and nextcord.abc. Your code should look like this:
+from nextcord import Interaction, SlashOption, ChannelType
+from nextcord.abc import GuildChannel
+from nextcord.ext import commands
+import nextcord
+
+The way you initiate the client also seems to be wrong, I guess:
+client = commands.Bot(command_prefix='YOUR_PREFIX')
+
",python
+"How can I convert an array of dates into a Pandas DataFrame?I have a Python array in the form:
+K = [ [2022,1,16], [2022,1,18], [2022,2,12], [2022,3,24]]
+
+This array contains dates within sub-arrays.
+How can I turn it into a Pandas DataFrame with 1 column of dates in standard format (%d/%m/%Y)?
","import pandas as pd
+date_array = [ [2022,1,16], [2022,1,18], [2022,2,12], [2022,3,24]]
+date_df = pd.DataFrame(date_array, columns=['year', 'month', 'day'])
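+# to_datetime assembles the year/month/day columns; strftime renders the requested text format
+date_df['date'] = pd.to_datetime(date_df[['year', 'month', 'day']]).dt.strftime('%d/%m/%Y')
+
+If you'd like, you can then drop the extra columns with date_df.drop(columns=['year', 'month', 'day']).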
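+Note that parsing the dates and rendering them as %d/%m/%Y text are two separate steps. A minimal standard-library sketch of the formatting step, using the list K from the question (formatted is a name chosen for this illustration):

```python
from datetime import date

K = [[2022, 1, 16], [2022, 1, 18], [2022, 2, 12], [2022, 3, 24]]

# Build a date from each [year, month, day] triple, then format it
formatted = [date(*ymd).strftime("%d/%m/%Y") for ymd in K]
print(formatted)
# ['16/01/2022', '18/01/2022', '12/02/2022', '24/03/2022']
```

+The resulting list of strings can be passed to pd.DataFrame({'date': formatted}) if a text column is all that is needed.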
",python
+"Problems installing vtk PythonI can't install vtk in python project with this command:
+python -m pip install vtk
+
+I'm getting this error:
+ERROR: No matching distribution found for vtk
+ERROR: Could not find a version that satisfies the requirement vtk (from versions: none)
+
+I tried to update my pip package :
+python -m pip install --upgrade pip
+
+I tried to install the .whl file directly from the project directory:
+python -m pip install .\vtk-9.1.0-cp39-cp39-win_amd64.whl
+
+I tried all vtk .whl file versions but I keep getting this error:
+ERROR: vtk-9.1.0-cp37-cp37m-win_amd64.whl is not a supported wheel on this platform.
+
+PS: I use Python 3.10.0
","It looks like a problem that was reported 3 months ago. My recommendation is using python 3.9 for vtk, until it is resolved. I verified it works fine using python 3.9.10
",python
+"Why is the scrollbar not working in the canvas (tkinter)?I want the window frame to expand the whole canvas AND have a scrollbar. Now the scrollbar is there visually but is not working as a scrollbar.
+root = Tk()
+
+def onCanvasConfigure(e):
+    my_canvas.configure(scrollregion = my_canvas.bbox("all")) #make the scrollfunction work
+    my_canvas.itemconfig('window', height=(my_canvas.winfo_height()-100), width=(my_canvas.winfo_width()-100)) #set the frame window to canvas size
+
+
+#Below code to add scrollbar to app.
+# Layers (root -> main_frame -> my_canvas -> window (frame))
+
+# Create A Main Frame
+main_frame = Frame(root)
+main_frame.pack(fill=BOTH, expand=1)
+
+# Create A Canvas
+my_canvas = Canvas(main_frame)
+my_canvas.pack(side=LEFT, fill=BOTH, expand=1)
+
+# Add A Scrollbar To The Canvas
+my_scrollbar = ttk.Scrollbar(main_frame, orient=VERTICAL, command=my_canvas.yview)
+my_scrollbar.pack(side=RIGHT, fill=Y)
+
+# Configure The Canvas
+my_canvas.configure(yscrollcommand=my_scrollbar.set)
+
+
+# Create ANOTHER Frame INSIDE the Canvas
+window = Frame(my_canvas)
+
+# Add that New frame To a Window In The Canvas
+my_canvas.create_window((0,0), window=window, anchor="nw", tags="window")
+
+my_canvas.bind("<Configure>", onCanvasConfigure)
+
+See clip: https://jumpshare.com/v/TJlbWJac5d4rwp3DwnFw
","Since you resize the internal frame window to the same size as the canvas, the scrollregion ends up about the same size as the canvas, so the scrollbar is never activated.
+If you set the height of the frame larger than that of the canvas, the scrollbar will be activated:
+def onCanvasConfigure(e):
+    # resize the frame to double the height of the canvas
+    my_canvas.itemconfig('window', height=e.height*2, width=e.width)
+    # update scrollregion
+    my_canvas.configure(scrollregion=my_canvas.bbox("all"))
+
",python
+"Matrix size-incompatible: In[0]: [47,1000], In[1]: [4096,256]I'm new to TensorFlow and am following a tutorial. I'm trying to do image captioning using VGG. I am getting an error that says:
+[Image: screenshot of the error; not preserved]
+This is my code:
+model = define_model(vocab_size, max_length)
+epochs = 20
+
+steps = len(train_descriptions)
+
+for i in range(epochs):
+    generator = data_generator(train_descriptions, train_features, tokenizer, max_length)
+    model.fit_generator(generator, epochs=1, steps_per_epoch=steps, verbose=1)
+    model.save('model_' + str(i) + '.h5')
+
+I'm just following the tutorial but that video was taken a long time ago. Since I'm new to this I don't understand this error. I tried model.fit() also. But, nothing works. Please, help me to rectify this.
+https://github.com/nitinkaushik01/Deep_and_Machine_Learning_Projects/blob/master/Image_Caption_Project/Image_caption_Project.ipynb - This is the tutorial I'm following.
","
+- Solved after applying the modifications below:
+inputs1 --> 1000 instead of 4096
+se1 --> 47 instead of 256
+decoder2 --> 47 instead of 256
+fe2 --> 47 instead of 256
+se3 --> 47 instead of 256
+- Or just update inputs1 to 1000; I think that alone will solve the issue.
+
",python
+"Python - Expanding a group using seleniumI'm new to python and I'm trying to automate a few tasks.
+My issue is when I log in to my server I need to expand the group and select the form inside the group, I'm using find_element By Xpath and ID and keep getting this error "Unable to locate element:", I tried to use sleep or WebDriverWait but didn't work.
+My element code in image now (the arrow):
+
+When I open it manual the aria-expanded changed to ="true"
+My group code:
+
+Main group
+Forms
+
+The website login details (dummy server):
+server: https://new2001.surveycto.com/
+user: taevion.ezrael@alldrys.com
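+password: Stackoverflow_example
","import time
+
+from selenium import webdriver
+from selenium.webdriver.common.action_chains import ActionChains
+from selenium.webdriver.common.by import By
+
+driver = webdriver.Chrome()  # assumption: Chrome; any installed WebDriver should do
+driver.get("https://new2001.surveycto.com/")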
+time.sleep(5)
+driver.find_element(By.ID, "login-username").send_keys("taevion.ezrael@alldrys.com")
+driver.find_element(By.XPATH, "//button[text()='Next']").click()
+time.sleep(1)
+driver.find_element(By.ID, "login-password").send_keys("Stackoverflow_example")
+time.sleep(1)
+driver.find_element(By.XPATH, "//button[text()='Log in']").click()
+time.sleep(10)
+ActionChains(driver).move_to_element(driver.find_element(By.XPATH, "(//*[@data-form-field-id='foundation'])[1]")).perform()
+time.sleep(1)
+driver.find_element(By.XPATH, "(//*[@data-form-field-id='foundation'])[1]//button[@data-original-title='Add form, group or dataset']").click()
+time.sleep(1)
+driver.find_element(By.XPATH, "(//*[@data-form-field-id='foundation'])[1]//div[@class='primary-actions']//a[contains(@data-original-title, 'start from scratch')]").click()
+time.sleep(5)
+x = driver.find_element(By.XPATH, "//form[@method='post']//h3").text
+print(x)
+driver.save_screenshot('snapshot.png')
+driver.quit()
+
+This code logs in to the website with the credentials and then clicks on the + icon of the Foundation form, and after the expansion occurs, it clicks on the Start New Form button, and just to verify for you, it takes the snapshot of the screen and extracts the form header.
+Output:
+Start new form - Step 1
+
+Process finished with exit code 0
+
+Form Snapshot
",python
+"How to get input entered in QLineEdit by clicking QPushButton and then print result in QLabel?first of all, I want to completely explain what I'm going to do exactly. I have a simple script and I want to convert it to an executable file or whatever you call it but first I should start with interface stuff.
+here is the simple python script:
+for i in range(4):
+ a = False
+ while not a:
+ text = input("enter a word")
+
+ # a condition for accepting input. I mean input should satisfy this condition
+ if len(text) == 5:
+ a = True
+
+ # just print the word, simple and easy
+ print(text)
+
+I want to move this process into PyQt5 to create an extremely simple user interface, but how should I do that? I want a QLineEdit to receive the input, which is a string, and a QPushButton to submit the word. While the input doesn't satisfy the condition (len(text) == 5), I should be able to type another input in the QLineEdit and submit it with the QPushButton until it does. Then the program should show the text in a QLabel (I think a QLabel is fine since I can adjust its dimensions and size). Importantly, I should be able to enter 4 correct words, that is, words that satisfy the condition (since I wrote for i in range(4):). So here is what I expect, step by step:
+
+- being able to enter 4 correct words
+- getting input by a QLineEdit
+- enter input by clicking on a QPushButton
+- if the length of the word is 5, then print it, else: wait for next input
+- do this process for 4 valid inputs
+
+I'm not much familiar with pyqt5, but I'm doing my best. here is what I wrote for this purpose:
+from PyQt5.QtWidgets import QApplication , QWidget , QPushButton , QLabel , QLineEdit
+import sys
+
+app = QApplication(sys.argv)
+
+class APP(QWidget):
+ def __init__(self):
+ super().__init__()
+
+ self.button = QPushButton("Enter" , self)
+ self.button.setGeometry(500 , 400 , 100 , 50)
+ self.button.clicked.connect(self.clicked)
+ self.Line_edit()
+
+ self.label = QLabel("place of ouput prints" , self)
+ self.label.move(40 , 60)
+ self.label.setGeometry(50 , 50 , 150 , 50)
+
+ def Line_edit(self):
+ self.input_text = QLineEdit(self)
+ self.input_text.setPlaceholderText("Enter The Word")
+
+ def clicked(self):
+ # if QLineEdit contains a word
+ if self.input_text.text():
+ for i in range(4):
+ a = False
+ while not a:
+ text = self.input_text.text()
+
+ # a condition for accepting input. I mean input should satisfy this condition
+ if len(text) == 5:
+ a = True
+
+ # just print the word in QLable, simple and easy
+ self.label = QLabel(text , self)
+ # then it should be waiting for new input untill i == 3 (since i wrote for i in range(4))
+
+
+window = APP()
+window.show()
+sys.exit(app.exec_())
+
+when I try to run this code, it doesn't even react and nothing happens. how should I modify the code to get what I am supposed to?
","I figured out the solution myself. I had misunderstood how the clicked signal of a QPushButton works: the connected method runs once per click, so it has to handle a single input and return instead of looping. I also found out that a QLabel can be updated using .setText(). So here is the solution:
+from PyQt5.QtWidgets import QApplication , QWidget , QPushButton , QLabel , QLineEdit
+import sys
+
+app = QApplication(sys.argv)
+
+class APP(QWidget):
+    def __init__(self):
+        super().__init__()
+
+        self.button = QPushButton("Enter" , self)
+        self.button.setGeometry(500 , 400 , 100 , 50)
+        self.button.clicked.connect(self.clicked)
+        # number of valid words accepted so far
+        self._submit_counter = 0
+
+        self.button2 = QPushButton("RESET" , self)
+        self.button2.setGeometry(500 , 600 , 100 , 50)
+        self.button2.clicked.connect(self.reset_clicked)
+
+        self.Line_edit()
+
+        self.label = QLabel("place of ouput prints" , self)
+        self.label.move(40 , 60)
+        self.label.setGeometry(50 , 50 , 150 , 50)
+
+    def Line_edit(self):
+        self.input_text = QLineEdit(self)
+        self.input_text.setPlaceholderText("Enter The Word")
+
+    def reset_clicked(self):
+        self._submit_counter = 0
+
+    def clicked(self):
+
+        if self._submit_counter<4:
+
+            # if QLineEdit contains a word
+            if self.input_text.text():
+
+                text = self.input_text.text()
+
+                # a condition for accepting input. I mean input should satisfy this condition
+                if len(text) == 5:
+
+                    self._submit_counter += 1
+                    # just show the word in the QLabel, simple and easy
+                    self.label.setText(text)
+
+
+window = APP()
+window.show()
+sys.exit(app.exec_())
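The important change is that each click handles exactly one input and the state lives on the object between clicks. As a rough GUI-free sketch of that logic only (WordCollector and its method names are hypothetical and not part of PyQt5), the same behaviour can be modelled and checked without opening a window:

```python
class WordCollector:
    """Hypothetical GUI-free model of the state kept by the click handler."""

    def __init__(self, limit=4, length=5):
        self.limit = limit      # how many valid words are accepted in total
        self.length = length    # required word length
        self.accepted = []      # valid words received so far

    def submit(self, text):
        """Mimic one button click: return the text to display, or None if ignored."""
        if len(self.accepted) >= self.limit:
            return None         # limit reached, further clicks are ignored
        if len(text) != self.length:
            return None         # invalid word, wait for the next click
        self.accepted.append(text)
        return text

    def reset(self):
        """Mimic the RESET button: allow another round of inputs."""
        self.accepted.clear()


collector = WordCollector()
for word in ['abc', 'hello', 'world', 'apple', 'grape', 'extra']:
    print(word, '->', collector.submit(word))
```

Each call to submit plays the role of one click, so nothing ever blocks waiting for input, which is what the while loop in the question's handler did.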
+
",python
+"Remove the black border and keep the image size same in pythonI am rotating an image 45 degree. When I rotate the image it has a black border and the image size changes from 256x256 to 364x364. I want to remove the values of this black border and keep the size 256. In pillow if I put fill color then the black border will go away but the size is still the same. Is there any way that I can remove the black border and retain the original shape
+code to rotate
+path = "E:\\download\\1.jpeg"
+image = cv2.imread(path)
+rotated = imutils.rotate_bound(image, -33)
+
+Original Image
+![]()
+Rotate image
+![]()
","You can crop the rotated image down to the region you want to keep. Pillow's crop takes a box given in pixel coordinates, so you have to choose the left, top, right and bottom edges yourself.
+The following code might help you to crop:
+from PIL import Image
+
+im = Image.open(r"specify the path of the image")
+
+width, height = im.size
+# Replace each placeholder below with an integer pixel coordinate
+left = "specify the value in pixels"
+top = "specify the value in pixels"
+right = "specify the value in pixels"
+bottom = "specify the value in pixels"
+# Cropped image of above dimension
+# (It will not change original image)
+im1 = im.crop((left, top, right, bottom))
+#im1.show()
+im1.save('specify destination path')
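One possible way to fill in the four pixel values, sketched under the assumption that you want a crop centered in the rotated image (center_crop_box is a hypothetical helper, not a Pillow function):

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) for a centered crop of the given size."""
    if target_w > width or target_h > height:
        raise ValueError('target size must fit inside the image')
    left = (width - target_w) // 2
    top = (height - target_h) // 2
    return (left, top, left + target_w, top + target_h)


# Sizes taken from the question: 364x364 after rotation, 256x256 wanted.
box = center_crop_box(364, 364, 256, 256)
print(box)
```

The resulting tuple can be passed to crop. Note that this only restores the 256x256 size: for a 45 degree rotation the corners of a centered 256x256 box still fall outside the rotated content, so some border remains there unless you crop smaller or scale the image up.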
+
",python
+"Walking through python list comprehensionI'm trying to understand list comprehension in python using this example -
+async def on_member_update(before, after):
+ stream = [i for i in after.activities if str(i.type) == "ActivityType.streaming"]
+ if stream:
+
+The above example is using the api from discord.py. Streaming is a member activity - https://discordpy.readthedocs.io/en/stable/api.html#activity . ActivityType.streaming is a type - https://discordpy.readthedocs.io/en/stable/api.html#discord.ActivityType
+What's going on in this loop? I'll try and walkthrough what I may know. So if (i.type) returns as a string then it would be looping through the characters in the stream list? I'm confused. after.activites is the member's CURRENT activity. So it'd be streaming. What exactly does (i.type) represent what how is the loop interacting with after.activites?
+Getting lost. Could someone walk me through the steps of what's happening here? Thank you!
","List comprehensions can be mechanically transformed into loops. This comprehension is equivalent to the code
+stream = []
+for i in after.activities:
+    if str(i.type) == "ActivityType.streaming":
+        stream.append(i)
+
+
+Unrelated: it is somewhat atypical to check the type of something by first converting it to a string and then comparing the resulting string. Knowing nothing else, I would expect to see instead the line
+    if i.type == ActivityType.streaming:
+
+or perhaps an isinstance() check instead of a direct type comparison if there can be meaningful subtypes.
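To see the filtering in isolation, here is a small self-contained sketch. The ActivityType enum and Activity class are hypothetical stand-ins for the discord.py objects, so it runs without a bot; only the shape of the comprehension carries over:

```python
from enum import Enum


class ActivityType(Enum):  # hypothetical stand-in for discord.ActivityType
    playing = 0
    streaming = 1


class Activity:  # hypothetical minimal activity object with a .type attribute
    def __init__(self, name, type_):
        self.name = name
        self.type = type_


activities = [
    Activity('chess', ActivityType.playing),
    Activity('live coding', ActivityType.streaming),
]

# Each i is one activity object, not a character: the comprehension keeps
# only the activities whose .type is the streaming member.
stream = [i for i in activities if i.type == ActivityType.streaming]

print([i.name for i in stream])
print(str(ActivityType.streaming))
```

The list ends up holding whole activity objects, so if stream: is simply asking whether at least one of the member's activities is a streaming one.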
",python
+"Python pandas sample without mixing indexI want to apply the sample function from Pandas independently for each value of the index for a data frame. This can be done with a for loop like this:
+import pandas
+
+df = pandas.DataFrame({'something': [3,4,2,2,6,7], 'n': [1,1,2,2,3,3]})
+df.set_index(['n'], inplace=True)
+
+resampled_as_I_want_df = df[0:0]
+for i in sorted(set(df.index)):
+ resampled_as_I_want_df = resampled_as_I_want_df.append(
+ df.loc[i].sample(frac=1, replace=True),
+ )
+
+print(resampled_as_I_want_df)
+
+Let me explain this in a human-friendly way. The df data frame looks like this:
+ something
+n
+1 3
+1 4
+2 2
+2 2
+3 6
+3 7
+
+Now we see that there are three "index groups" which have the values 1, 2 and 3. What I want to do is to apply the sample function in a way that the new data frame will have the same index, without random sampling, and the sampling is performed within each group as if they were independent data frames.
+Is there a way to avoid the for loop? For large data frames it is a bottleneck.
","Use df.groupby(level=0).sample(frac=1, replace=True).
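To make clear what sampling within each group means, here is a plain-Python sketch of the idea that does not use pandas at all (sample_within_groups is a hypothetical helper written for illustration, and it is not how pandas implements groupby sampling):

```python
import random
from collections import defaultdict


def sample_within_groups(rows, rng):
    """Resample the values of each key with replacement, keeping group sizes."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    resampled = []
    for key in sorted(groups):
        values = groups[key]
        # Draw as many values as the group has, only from that group.
        picks = rng.choices(values, k=len(values))
        resampled.extend((key, value) for value in picks)
    return resampled


# (n, something) pairs from the question's data frame
rows = [(1, 3), (1, 4), (2, 2), (2, 2), (3, 6), (3, 7)]
result = sample_within_groups(rows, random.Random(0))
print(result)
```

Every index value keeps its original number of rows, and a value can only ever be drawn from rows that share its index, which is the property the question asks for.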
",python
+"How to extract date from a string?I have this weird string:
+'": "1899-12-30 14:50:00.000"": " "'
+
+I need to just extract the date.
+I have looked at all the different python string manipulation functions but I just can't seem to find one that works for this interesting format.
+Thanks
","Based on your string if you just want 1899-12-30 you could do:
+'": "1899-12-30 14:50:00.000"": " "'.split(' ')[1][1:]
+
+if you want the full 1899-12-30 14:50:00.000 you could do
+'": "1899-12-30 14:50:00.000"": " "'.split('"')[2]
+
+Explanation:
+We split the string on a character that borders the date: a space in the first example and a double quote in the second. split returns a list, and we pick the element we want by its zero-based index: the second element (index 1) in the first case and the third element (index 2) in the second. In the first example the selected element still has a leading double quote, so the [1:] slice drops that first character and leaves only the date.
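If the position of the date inside the string might vary, a regular expression is a possible alternative. This is only a sketch that assumes the timestamp always has the YYYY-MM-DD HH:MM:SS.mmm shape seen in the question:

```python
import re
from datetime import datetime

raw = '": "1899-12-30 14:50:00.000"": " "'

# Look for the timestamp pattern anywhere in the string.
match = re.search(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}', raw)
timestamp = match.group(0)

date_only = timestamp.split(' ')[0]
parsed = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S.%f')

print(date_only)
print(parsed)
```

Parsing into a datetime object is optional, but it also validates that what was matched is a real date and time.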
",python
+"How can I get the symbolic gradient [Tensorflow 2.x]I want to get the symbolic expression for gradient estimation. When I see the output it's quite difficult to understand what's going on.
+import tensorflow as tf
+@tf.function
+def f_k(input_dat):
+ y = tf.matmul(tf.sin(input_dat[0]), input_dat[1])
+ grads = tf.gradients([y], input_dat)
+ # grads = tape.gradient([y], input_dat)
+ tf.print('tf >>', grads)
+ print('print >>', grads)
+ return y, grads
+
+
+a = tf.Variable([[1., 3.0], [2., 6.0]])
+b = tf.Variable([[1.], [2.]])
+input_data = [a, b]
+y, z = f_k(input_data)
+print(y, z)
+
+Output: inside the function
+print >> [<tf.Tensor 'gradients/Sin_grad/mul:0' shape=(2, 2) dtype=float32>, <tf.Tensor 'gradients/MatMul_grad/MatMul_1:0' shape=(2, 1) dtype=float32>]
+tf >> [[[0.540302277 -1.979985]
+ [-0.416146845 1.92034054]], [[1.75076842]
+ [-0.138295487]]
+
+As the output, I want which is shown with print:
+[<tf.Tensor 'gradients/Sin_grad/mul:0' shape=(2, 2) dtype=float32>, <tf.Tensor 'gradients/MatMul_grad/MatMul_1:0' shape=(2, 1) dtype=float32>]
+
+However, the function always returns the numerical result. Could someone help me to get this symbolic representation of the gradient?
","The symbolic representation you want will only work in graph mode. Outside of graph mode, eager execution is enabled by default. What you can do is create a new function to print the values and wrap it with the @tf.function decorator like you are already doing for f_k:
+import tensorflow as tf
+
+@tf.function
+def f_k(input_dat):
+    y = tf.matmul(tf.sin(input_dat[0]), input_dat[1])
+    grads = tf.gradients([y], input_dat)
+    # grads = tape.gradient([y], input_dat)
+    tf.print('tf >>', grads)
+    print('print >>', grads)
+    return y, grads
+
+a = tf.Variable([[1., 3.0], [2., 6.0]])
+b = tf.Variable([[1.], [2.]])
+input_data = [a, b]
+y, z = f_k(input_data)
+
+@tf.function
+def print_symbolic(y, z):
+    print(y,z)
+    return y, z
+y, z = print_symbolic(y, z)
+
+print >> [<tf.Tensor 'gradients/Sin_grad/mul:0' shape=(2, 2) dtype=float32>, <tf.Tensor 'gradients/MatMul_grad/MatMul_1:0' shape=(2, 1) dtype=float32>]
+tf >> [[[0.540302277 -1.979985]
+ [-0.416146845 1.92034054]], [[1.75076842]
+ [-0.138295487]]]
+Tensor("y:0", shape=(2, 1), dtype=float32) [<tf.Tensor 'z:0' shape=(2, 2) dtype=float32>, <tf.Tensor 'z_1:0' shape=(2, 1) dtype=float32>]
+
+You could also just access the tensors of your graph:
+graph = f_k.get_concrete_function(input_data).graph
+print(*[tensor for op in graph.get_operations() for tensor in op.values()], sep="\n")
+
+Tensor("input_dat:0", shape=(), dtype=resource)
+Tensor("input_dat_1:0", shape=(), dtype=resource)
+Tensor("Sin/ReadVariableOp:0", shape=(2, 2), dtype=float32)
+Tensor("Sin:0", shape=(2, 2), dtype=float32)
+Tensor("MatMul/ReadVariableOp:0", shape=(2, 1), dtype=float32)
+Tensor("MatMul:0", shape=(2, 1), dtype=float32)
+Tensor("gradients/Shape:0", shape=(2,), dtype=int32)
+Tensor("gradients/grad_ys_0/Const:0", shape=(), dtype=float32)
+Tensor("gradients/grad_ys_0:0", shape=(2, 1), dtype=float32)
+Tensor("gradients/MatMul_grad/MatMul:0", shape=(2, 2), dtype=float32)
+Tensor("gradients/MatMul_grad/MatMul_1:0", shape=(2, 1), dtype=float32)
+Tensor("gradients/Sin_grad/Cos:0", shape=(2, 2), dtype=float32)
+Tensor("gradients/Sin_grad/mul:0", shape=(2, 2), dtype=float32)
+Tensor("StringFormat:0", shape=(), dtype=string)
+Tensor("Identity:0", shape=(2, 1), dtype=float32)
+Tensor("Identity_1:0", shape=(2, 2), dtype=float32)
+Tensor("Identity_2:0", shape=(2, 1), dtype=float32)
+
+Check the docs for more information.
",python
+"Replacing all functions with 'pass', syncing private to public github repoI'd like to create a repository for a proprietary python module, similarly how this python package mlfinlabs. They have emptied out all functions like this:
+ def bet_size_dynamic(current_pos, max_pos, market_price, forecast_price, cal_divergence=10, cal_bet_size=0.95,
+ func='sigmoid'):
+ pass
+
+
+ def bet_size_budget(events_t1, sides):
+ pass
+
+I found the libcst module, that parses the source code and you can do operations on it.
+Is there a better practice of doing this? eg.: github actions? I can't find any other good solution.
","
+Use a GitHub Action to sync from the private to the public repo, for example this one: https://github.com/marketplace/actions/github-repo-sync
+
+Then use libcst (project site) to strip the functions afterwards (this can also be a GitHub Actions step). Put the codemod shown further down in one of these places:
+
+- in the libcst folder (\Lib\site-packages\libcst\codemod\commands)
+- in your repo directory, registering its folder in .libcst.codemod.yaml as follows (this is needed if you run the codemod on GitHub Actions):
+
+# String that LibCST should look for in code which indicates that the
+# module is generated code.
+
+generated_code_marker: '@generated'
+
+# Command line and arguments for invoking a code formatter. Anything
+# specified here must be capable of taking code via stdin and returning
+# formatted code via stdout.
+
+formatter: ['black', '-']
+
+# List of regex patterns which LibCST will evaluate against filenames to
+# determine if the module should be touched.
+
+blacklist_patterns: ['.*replace_functions\.py']
+
+# List of modules that contain codemods inside of them.
+
+modules:
+
+- 'libcst.codemod.commands'
+- 'mycodemod' # THIS IS THE NAME OF THE FOLDER
+
+# Absolute or relative path of the repository root, used for providing
+# full-repo metadata. Relative paths should be specified with this file
+# location as the base.
+
+repo_root: '.'
+
+Then put your codemod under:
+project_root
+|--- mycodemod
+    |--- __init__.py (this is an empty file)
+    |--- replace_functions.py (the codemod pasted below)
+
+In replace_functions.py put this snippet:
+import libcst as cst
+from libcst.codemod import CodemodContext, VisitorBasedCodemodCommand
+
+
+class ReplaceFunctionCommand(VisitorBasedCodemodCommand):
+    # Add a description so that future codemodders can see what this does.
+    DESCRIPTION: str = "Replaces the body of a function with pass."
+
+    def __init__(self, context: CodemodContext) -> None:
+        # Initialize the base class with context.
+        print("this happens")
+        super().__init__(context)
+
+
+    def leave_FunctionDef(self, original_node: cst.FunctionDef, updated_node: cst.FunctionDef) -> cst.FunctionDef:
+        # Build a new function with the same name whose body is a single pass.
+        # Note: the empty cst.Parameters() and returns=None also drop the
+        # original parameters and return annotation.
+        replace_function = cst.FunctionDef(
+            name=updated_node.name,
+            params=cst.Parameters(),
+            body=cst.SimpleStatementSuite((cst.Pass(),)),
+            returns=None)
+
+        return replace_function
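For comparison, the same idea can be sketched with only the standard library's ast module (Python 3.9+ for ast.unparse). This is a different technique from the libcst codemod above, shown just to illustrate the transformation: ast discards comments and original formatting, which is one reason to prefer libcst on a real code base. StripBodies is a hypothetical name.

```python
import ast


class StripBodies(ast.NodeTransformer):
    """Replace the body of every function definition with a single pass."""

    def _strip(self, node):
        # Keep the name, arguments and decorators; only swap the body.
        node.body = [ast.Pass()]
        return node

    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip


source = '''
def bet_size_budget(events_t1, sides):
    total = events_t1 + sides
    return total
'''

tree = StripBodies().visit(ast.parse(source))
stripped = ast.unparse(ast.fix_missing_locations(tree))
print(stripped)
```

Unlike the codemod above, this sketch keeps the parameter list, which matches the emptied functions shown in the question.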
+
",python
+"Pytorch - Problem with fine tune training from custom features and classesThe core of my problem is the fact that my features come from NumPy files (.npy).
+Therefore I need the following class in my code
+import torch
+import torchvision
+import torchvision.transforms as transforms
+import torch.nn as nn
+import torch.nn.functional as F
+import torch.optim as optim
+import numpy as np
+from torch.utils.data import Dataset, DataLoader
+from torchvision.models import resnet50
+import time
+import copy
+
+class MyDataSet(torch.utils.data.Dataset):
+ def __init__(self, x, y, transform=None):
+ super(MyDataSet, self).__init__()
+ # store the raw tensors
+ self._x = np.load(x)
+ self._y = np.load(y)
+ self.transform = transform
+
+ def __len__(self):
+ # a DataSet must know it size
+ return self._x.shape[0]
+
+ def __getitem__(self, index):
+ x = self._x[index, :]
+ y = self._y[index, :]
+ return x, y
+
+To convert my NumPy files to DataLoaders I do the following. The code below seems to work (at least, no errors are returned)
+#Transform dataset
+transform = transforms.Compose([transforms.ToTensor()])
+dataset = MyDataSet("train1-features.npy","train1-classes.npy",transform=transform)
+dataloader = DataLoader(dataset, batch_size=32)
+
+I am trying to fine-tune a RESNET-50 network in these data with 12 classes. Here is what I do
+def set_parameter_requires_grad(model, feature_extracting):
+ if feature_extracting:
+ for param in model.parameters():
+ param.requires_grad = False
+
+feature_extract = True
+batch_size = 8
+num_epochs = 15
+num_classes=12
+
+model_ft = resnet50(pretrained=True)
+set_parameter_requires_grad(model_ft, feature_extract)
+num_ftrs = model_ft.fc.in_features
+model_ft.fc = nn.Linear(num_ftrs, num_classes)
+input_size = 224
+
+if torch.cuda.is_available():
+ model_ft.cuda()
+
+params_to_update = model_ft.parameters()
+
+print("Params to learn:")
+if feature_extract:
+ params_to_update = []
+ for name,param in model_ft.named_parameters():
+ if param.requires_grad == True:
+ params_to_update.append(param)
+ print("\t",name)
+else:
+ for name,param in model_ft.named_parameters():
+ if param.requires_grad == True:
+ print("\t",name)
+
+# Observe that all parameters are being optimized
+optimizer_ft = optim.SGD(params_to_update, lr=0.001, momentum=0.9)
+
+# Setup the loss fxn
+criterion = nn.CrossEntropyLoss()
+
+Finally, here is the problematic training function
+for epoch in range(num_epochs): # loop over the dataset multiple times
+
+ running_loss = 0.0
+ for i, data in enumerate(dataloader, 0):
+
+ # get the inputs; data is a list of [inputs, labels]
+ inputs, labels = data
+
+ #transfer labels and inputs to cuda()
+ inputs,labels=inputs.cuda(), labels.cuda()
+
+ # zero the parameter gradients
+ optimizer.zero_grad()
+
+ # forward + backward + optimize
+ outputs = model_ft(inputs)
+ loss = loss_func(outputs, labels)
+ loss.backward()
+ optimizer.step()
+
+ # print statistics
+ running_loss += loss.item()
+ if i % 2000 == 1999: # print every 2000 mini-batches
+ print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
+ running_loss = 0.0
+
+This returns me the following error once I execute the code:
+Traceback (most recent call last):
+ File "train_my_data_example.py", line 89, in <module>
+ for i, data in enumerate(dataloader, 0):
+ File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 517, in __next__
+ data = self._next_data()
+ File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 557, in _next_data
+ data = self._dataset_fetcher.fetch(index) # may raise StopIteration
+ File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
+ data = [self.dataset[idx] for idx in possibly_batched_index]
+ File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
+ data = [self.dataset[idx] for idx in possibly_batched_index]
+ File "train_my_data_example.py", line 29, in __getitem__
+ y = self._y[index, :]
+IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
+
+The error is clearly the dataloader variable, so is this creation ok? I mean, I am loading NumPy data and transforming it to a data loader as below:
+transform = transforms.Compose([transforms.ToTensor()])
+dataset = MyDataSet("train1-features.npy","train1-classes.npy",transform=transform)
+dataloader = DataLoader(dataset, batch_size=32)
+
+Is there any error in my data loader or is the problem the training loop of Pytorch?
+P.s: you can reproduce my code by downloading the classes and features here
","You are trying to index the second axis of an array which only has a single dimension. Simply replace y = self._y[index, :] with y = self._y[index].
+Actually when positioned last, : is not required as all dimensions are selected by default.
",python
+"Github action deploy-cloud-functions not building in dependencies?Yesterday, I was trying to restructure some Github CI/CD, because the actions from Google were throwing warnings about deprecated usages.
+One of the steps is the (build and) deployment of a GCP function.
+The repository of the function to be deployed was structured like this:
+my_proj
+ |- .github
+ |- src
+ |- my_proj
+ |- __init__.py
+ |- main.py
+ |- requirements.txt
+...
+
+,with the requirements.txt holding
+boto3==1.16.54
+
+The important bit here is the requirements.txt, that holds some dependencies, that I need to ship as well.
+Before, I had to build the package uploaded to GCP myself, but with the "deploy-cloud-functions" action this seemed to be obsolete now. I set up the actions in Github according to documentation:
+steps:
+ - name: Login to GCP
+ uses: google-github-actions/auth@v0
+ with:
+ credentials_json: ...
+
+ - name: Deploy GCP Function image
+ uses: google-github-actions/deploy-cloud-functions@v0
+ with:
+ name: my_function_name
+ runtime: python37
+ project_id: ...
+ source_dir: ./src/my_proj
+ env_vars:
+ ...
+
+Now, the deployment worked. However, when inspecting the function now in GCP or downloading it, none of the dependencies were contained there and the logs upon triggering the function similarly showed a function crash due to missing dependencies.
+I also tried to move the requirements.txt file to the project root, but apparently to no avail. I was not very lucky in finding extensive documentation about the work with GCP functions from within Github beyond the above linked Google-owned action repository.
+Can anyone spot my error here?
","While deploying to Cloud Functions using github actions all the dependencies also get uploaded. But, as already mentioned by Danyel Cabello, you won’t be able to see the dependencies in the source tab of the Cloud Functions in Google Cloud Console.
+To see the build logs you can search for resource.type="build" in the Cloud Logging of Google Cloud Console.
",python
+"Filter by CharField pretending it is DateField in Django ORM/mySqlI am working with a already-done mySQL Database using Django ORM and I need to filter rows by date - if it wasn't that dates are not in Date type but normal Varchar(20) stored as dd/mm/yyyy hh:mm(:ss).
+With a free query I would transform the field into date and I would use > and < operators to filter the results but before doing this I was wondering whether Django ORM provides a more elegant way to do so without writing raw SQL queries.
+I look forward to any suggestion.
+EDIT: my raw query would look like
+SELECT * FROM table WHERE STR_TO_DATE(mydate,'%d/%m/%Y %H:%i') > STR_TO_DATE('30/12/2020 00:00', '%d/%m/%Y %H:%i')
+
+Thank you.
","I will assume your model looks like this:
+from django.db import models
+
+class Event(models.Model):
+ mydate = models.CharField(max_length=20)
+
+ def __str__(self):
+ return f'Event at {self.mydate}'
+
+You can construct a Django query expression to represent this computation. This expression consists of:
+
+- Func objects representing your STR_TO_DATE function calls.
+- An F object representing your field name.
+- A GreaterThan function to represent your > comparison.
+
+from django.db.models import F, Func, Value
+from django.db.models.lookups import GreaterThan
+from .models import Event
+
+# Create some events for this example
+Event(mydate="29/12/2020 00:00").save()
+Event(mydate="30/12/2020 00:00").save()
+Event(mydate="31/12/2020 00:00").save()
+
+class STR_TO_DATE(Func):
+    "Lets us use MySQL's STR_TO_DATE() function directly from Python"
+    function = 'STR_TO_DATE'
+
+# This Django query expression converts the mydate field
+# from a string into a date, using the STR_TO_DATE function.
+mydate = STR_TO_DATE(F('mydate'), Value('%d/%m/%Y %H:%i'))
+
+# This Django query expression represents the value 30/12/2020
+# as a date.
+date_30_12_2020 = STR_TO_DATE(Value('30/12/2020 00:00'), Value('%d/%m/%Y %H:%i'))
+
+# This Django query expression puts the other two together,
+# creating a query like this: mydate > 30/12/2020
+expr = GreaterThan(mydate, date_30_12_2020)
+
+# Use the expression as a filter
+events = Event.objects.filter(expr)
+print(events)
+
+# You can also use the annotate function to add a calculated
+# column to your query...
+events_with_date = Event.objects.annotate(date=mydate)
+
+# Then, you just treat your calculated column like any other
+# field in your database. This example uses a range filter
+# (see https://docs.djangoproject.com/en/4.0/ref/models/querysets/#range)
+events = events_with_date.filter(date__range=["2020-12-30", "2020-12-31"])
+print(events)
+
+I tested this answer with Django 4.0.1 and MySQL 8.0.
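As a side note on why the conversion is needed at all: comparing the raw dd/mm/yyyy strings orders them by their leading day digits. A small standalone sketch in plain Python (no Django or MySQL involved; note that Python's strptime writes minutes as %M where MySQL's STR_TO_DATE uses %i):

```python
from datetime import datetime

FORMAT = '%d/%m/%Y %H:%M'

earlier = '31/01/2020 00:00'   # January
later = '30/12/2020 00:00'     # December

# As strings, '31/...' sorts after '30/...' because the day comes first.
print(earlier > later)

# As dates, January 31 correctly comes before December 30.
print(datetime.strptime(earlier, FORMAT) > datetime.strptime(later, FORMAT))
```

The first comparison is True and the second is False, which is why filtering directly on the CharField would return wrong rows.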
",python
+"How to set default python3 to python 3.9 instead of python 3.8 in Ubuntu 20.04 LTSI have installed Python 3.9 in the Ubuntu 20.04 LTS. Now the system has both Python 3.8 and Python 3.9.
+# which python
+# which python3
+/usr/bin/python3
+# which python3.8
+/usr/bin/python3.8
+# which python3.9
+/usr/bin/python3.9
+# ls -alith /usr/bin/python3
+12583916 lrwxrwxrwx 1 root root 9 Jul 19 2021 /usr/bin/python3 -> python3.8
+
+But the pip3 command will still install everything into the Python 3.8 directory.
+# pip3 install --upgrade --find-links file:///path/to/directory <...>
+
+I want to change that default pip3 behavior by updating the symbolic link /usr/bin/python3 to /usr/bin/python3.9.
+How to do that?
+# update-alternatives --set python3 /usr/bin/python3.9
+This command will not work as expected.
+
+Here is the pip3 info:
+# which pip3
+/usr/bin/pip3
+# ls -alith /usr/bin/pip3
+12589712 -rwxr-xr-x 1 root root 367 Jul 13 2021 /usr/bin/pip3
+# pip3 -V
+pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)
+#
+
+The alias command will not work:
+# alias python3=python3.9
+# ls -alith /usr/bin/python3
+12583916 lrwxrwxrwx 1 root root 9 Jul 19 2021 /usr/bin/python3 -> python3.8
+
","You should be able to use python3.9 -m pip install <package> to run pip with a specific python version, in this case 3.9.
+The full docs on this are here: https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
+If you want python3 to point to python3.9, a quick and dirty option is an alias:
+alias python3=python3.9
+
+EDIT:
+Tried to recreate your problem,
+# which python3
+/usr/bin/python3
+# python3 --version
+Python 3.8.10
+# which python3.8
+/usr/bin/python3.8
+# which python3.9
+/usr/bin/python3.9
+
+Then update the alternatives, and set new priority:
+# sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
+# sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2
+# sudo update-alternatives --config python3
+There are 2 choices for the alternative python3 (providing /usr/bin/python3).
+
+ Selection Path Priority Status
+------------------------------------------------------------
+ 0 /usr/bin/python3.9 2 auto mode
+ 1 /usr/bin/python3.8 2 manual mode
+* 2 /usr/bin/python3.9 2 manual mode
+
+Press <enter> to keep the current choice[*], or type selection number: 0
+
+Check new version:
+# ls -alith /usr/bin/python3
+3338 lrwxrwxrwx 1 root root 25 Feb 8 14:33 /usr/bin/python3 -> /etc/alternatives/python3
+# python3 -V
+Python 3.9.5
+# ls -alith /usr/bin/pip3
+48482 -rwxr-xr-x 1 root root 367 Jul 13 2021 /usr/bin/pip3
+# pip3 -V
+pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.9)
+
+Hope this helps (tried it in wsl2 Ubuntu 20.04 LTS)
",python
+"How to get nodes credentials using jenkinsapiI need to get credentials information (which credentials used) for node.
+![]()
+Currently I use this code that prints LOT of information but no credentials info that used for the node:
+for node in get_server_instance().nodes._data['computer']:
+ for i in node:
+ print (i, node[i])
+
+Is there any way to reach credentials ?
+Thanks
",The only way is to run grep "credentials" */config.xml on the server inside the nodes directory
,python
+"come back to tree view when user click on save button in odoo 13.0I override the write function in my model for calling my function that is manually set changing in DB, actually I delete this record from table in this function after that I want comeback to tree view automatically and I don't have the changed record anymore to return
+because of that I stuck in write function and it doesn't finished by the way without calling my function I still can't comeback to tree view with return action(I tried any form of return action it didn't work at all) :
+def write(self, vals):
+ self.changing_status(vals)
+ action = {
+ 'name': _('Cash Control'),
+ 'view_mode': 'tree',
+ 'view_type': 'form',
+ 'res_model': 'wfwodoovitemsstatuscurrent',
+ 'view_id': self.env.ref('nmdi_workflow.list').id,
+ 'type': 'ir.actions.act_window',
+ 'target': 'current'
+}
+return action
+
","I solved this by adding a button to the tree view that opens a wizard instead of the form view. I defined a transient model for the wizard and filled in its default values with a function. The wizard has a Save button whose method applies the changes to the table manually and then returns an action that goes back to the tree view and refreshes it.
+class WfwU1001W1(models.TransientModel):
+    _name = 'wfwu1001wizard1'
+
+    field1 = fields.Integer( default=lambda self: self._get_data('field1'))
+    field2 = fields.Integer( default=lambda self: self._get_data('field2'))
+
+    @api.model
+    def _get_data(self, field_name):
+        id = self.env.context.get("active_id")
+        if id:
+            return self.env['wfwodoovitemsstatuscurrent'].browse(id).mapped(field_name)[0]
+
+
+    def save_changing_action(self):
+        # do some logic and save manually
+
+        return {
+            'name': 'closewizard',
+            'view_type': 'tree',
+            'view_mode': 'tree',
+            'res_model': 'wfwodoovitemsstatuscurrent',
+            'view_id': False,
+            'views': [(self.env.ref('nmdi_workflow.list').id, 'tree')],
+            'type': 'ir.actions.act_window'
+        }
+
",python
+"Call MongoDB stored function from pymongoI have the following function stored in mongodb:
+db.system.js.save({
+ _id: "testFunc",
+ value: function() {
+ return db.clients.find_one({}, {"_id" : False});
+ }
+});
+
+How do I call the function above from python using pymongo?
+I have already tried db.eval('testFunc') and it doesn't work. The method is deprecated.
","eval, system_js etc. are all deprecated.
+While you can still create javascript functions on the server, it comes with a clear warning:
+
+Do not store application logic in the database. There are performance
+limitations to running JavaScript inside of MongoDB. Application code
+also is typically most effective when it shares version control with
+the application itself.
+
",python
+"Why does float() not ""unstring"" a string formatted list element?I need to parse a comma delimited string that is part of a dict into values. I receive the data originally as a (huge) JSON formatted string and am loading it into a dict with json.loads().
+Some of the values split from the string will always be floats and I just want to float() those, while others can be either strings, null, or empty and will need to be treated separately (not the topic of this question).
+Bizarrely, after splitting the string, the resulting list appears to contain some non-float()-able version of a string.
+Consider this small python3 example code
+# construct the sample dict
+var = {}
+var['abc'] = '"123.456","zzz"'
+
+# JSONify it
+var_json = json.dumps(var)
+
+print("var_json: %s" % json.loads(var_json))
+
+# var_json now exemplifies the input data
+# deJSONify it:
+data_in = json.loads(var_json)
+
+a = data_in['abc'].split(",")
+print(a[0])
+print("this works:", float("123.456"))
+print("this borks:", float(a[0]))
+
+This results in the following output:
+var_json: {'abc': '"123.456","zzz"'}
+"123.456"
+this works: 123.456
+Traceback (most recent call last):
+ File "./test.py", line 26, in <module>
+ print("this borks:", float(a[0]))
+ValueError: could not convert string to float: '"123.456"'
+
+So: Clearly, to python the value in the list resulting from the split is a string (has double quotes around it in the output). But using float() on that string doesn't work.
+Changing that last line to manually replace the quotes works:
+print(float(a[0].replace("\"", "")))
+
+So it looks like a[0] is in fact a string containing double quotes.
+The same error occurs even without the json.dumps/loads roundtrip, e.g. just accessing the split list from the dict directly:
+print("This also borks: ", float(var['abc'].split(",")[0]))
+
+Why does float() not "unstring" what very clearly is a string and a valid float conversion input? How can I avoid that .replace() call?
","It's not enough to just split on commas; you also need to remove the literal quotation marks from the content of the string.
+Quotation marks are not digits. A string whose data includes quotation marks is therefore not a string that contains only a number. Just as the Python string 'a123a' cannot be parsed as a number, neither can '"123"': the " characters are just as out of place in the second example as the a characters are in the first.
+For example, you could use:
+float(a[0].replace('"', ''))
+
+
+Insofar as your JSON document is encapsulating CSV data, you can use the Python csv module to parse it in a way that will remove those quotes:
+import csv
+
+data_in = {'abc': '"123.456","zzz"'}
+a = next(csv.reader([data_in['abc']]))  # csv.reader strips the field quotes
+print("this now works:", float(a[0]))
+
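The csv approach above extends to the whole field list. The following is a minimal sketch, assuming every value that float() accepts should become a float and everything else should stay a string for separate handling; the helper name parse_fields is hypothetical and not part of the original code.

```python
import csv
import json


def parse_fields(raw):
    """Split one CSV-formatted string and convert numeric fields to float."""
    # csv.reader expects an iterable of lines, so wrap the string in a list.
    fields = next(csv.reader([raw]))
    parsed = []
    for field in fields:
        try:
            parsed.append(float(field))
        except ValueError:
            # Not a number (or empty): keep the string for separate handling.
            parsed.append(field)
    return parsed


# Same round trip as in the question: dict -> JSON text -> dict.
data_in = json.loads(json.dumps({'abc': '"123.456","zzz"'}))
print(parse_fields(data_in['abc']))  # [123.456, 'zzz']
```

Because the csv module removes the surrounding quotes while splitting, no .replace() call is needed.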
",python
+"""return"" with Ternary and walrus operatorsWhy, when I try to return a value with the walrus (:=) and ternary operators like this:
+def get_index(elements: List[int], i: int, boundary: int) -> tp.Optional[int]:
+ return x := elements[i] if elements[i] > boundary else None
+
+I get an error.
","The comments suggest that you can't use the walrus operator in a return statement. That's incorrect: you just have a syntax error.
+This works, but is pointless, as also pointed out in the comments:
+from typing import List, Optional
+
+
+def get_index(elements: List[int], i: int, boundary: int) -> Optional[int]:
+ return (x := elements[i] if elements[i] > boundary else None)
+
+
+print(get_index([1, 2, 3], 2, 1))
+
+All you need is parentheses around the walrus assignment and the value of the expression will be the assigned value.
+But why assign to x if all you do is return that value? Instead:
+from typing import List, Optional
+
+
+def get_index(elements: List[int], i: int, boundary: int) -> Optional[int]:
+ return elements[i] if elements[i] > boundary else None
+
+
+print(get_index([1, 2, 3], 2, 1))
+
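For comparison, here is a sketch of a place where the walrus operator does earn its keep in this function: binding elements[i] once inside the condition, so the list is indexed only once. This is an illustrative variant for Python 3.8+, not something the original answer proposed.

```python
from typing import List, Optional


def get_index(elements: List[int], i: int, boundary: int) -> Optional[int]:
    # The condition is evaluated first, so x is bound before it is returned.
    return x if (x := elements[i]) > boundary else None


print(get_index([1, 2, 3], 2, 1))  # 3 > 1, so this prints 3
print(get_index([1, 2, 3], 0, 1))  # 1 is not > 1, so this prints None
```

The parentheses around the assignment expression are still required here, for the same reason as in the answer above.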
",python
+"How to send file from Nodejs to Flask Python?Hope you are doing well.
+I'm trying to send PDF files from Node.js to Flask using Axios.
+I read each file from a directory (as a buffer), append it to a formData object (from an npm package), and send it with an Axios request.
+ const existingFile = fs.readFileSync(path)
+ console.log(existingFile)
+ const formData = new nodeFormData()
+ formData.append("file", existingFile)
+ formData.append("fileName", documentData.docuName)
+ try {
+ const getFile = await axios.post("http://127.0.0.1:5000/pdf-slicer", formData,
+ {
+ headers: {
+ ...formData.getHeaders()
+ }
+ })
+ console.log(getFile)} catch (e) {console.log(e, "getFileError")}
+
+On flask side:
+I'm trying to get data from the request.
+ print(request.files)
+
+ if (request.method == "POST"):
+ file=request.form["file"]
+ if file:
+ print(file)
+
+In request.files, I'm getting ImmutableMultiDict([]),
+but in request.form["file"], I'm getting what looks like the raw file data.
+How can I handle this format, or how can I convert it to a Python file object?
","I solved this issue by updating my Node.js code.
+The file has to be appended to the form data as a read stream instead of as a buffer,
+so I made a minor change to my formData code:
+before: formData.append("file", existingFile)
+after: formData.append("file", fs.createReadStream(path))
+
+Note: fs.createReadStream expects a file path (a string or Uint8Array
+without null bytes), so we cannot pass it the buffer holding the file contents.
+
",python
+"Matplotlib colorbar: __init__() got an unexpected keyword argument 'location'I was trying to plot a matplotlib colorbar to the left of my axis following the example given here: https://matplotlib.org/stable/gallery/axes_grid1/simple_colorbar.html#sphx-glr-gallery-axes-grid1-simple-colorbar-py
+But as I wanted to have the colorbar on the left side of the axis, I tried:
+import matplotlib.pyplot as plt
+import numpy as np
+from mpl_toolkits.axes_grid1 import make_axes_locatable
+
+fig, ax = plt.subplots(1, 1)
+
+im = plt.imshow(np.arange(0, 100).reshape(10, 10))
+ax.set_xticklabels([])
+ax.set_yticklabels([])
+
+divider = make_axes_locatable(ax)
+cax = divider.append_axes("left", size="5%", pad=0.05)
+colorbar = fig.colorbar(im, cax=cax, location='left')
+colorbar.set_label('y label')
+
+which gives me the following exception:
+---------------------------------------------------------------------------
+TypeError Traceback (most recent call last)
+<ipython-input-6-89d8edf2c11c> in <module>
+ 11 divider = make_axes_locatable(ax)
+ 12 cax = divider.append_axes("left", size = "5%", pad = 0.05)
+---> 13 colorbar = fig.colorbar(im, cax = cax, location = 'left')
+ 14 colorbar.set_label('y label')
+ 15
+
+~\Anaconda3\envs\data-evaluation\lib\site-packages\matplotlib\figure.py in colorbar(self, mappable, cax, ax, use_gridspec, **kw)
+ 1171 'panchor']
+ 1172 cb_kw = {k: v for k, v in kw.items() if k not in NON_COLORBAR_KEYS}
+-> 1173 cb = cbar.Colorbar(cax, mappable, **cb_kw)
+ 1174
+ 1175 self.sca(current_ax)
+
+~\Anaconda3\envs\data-evaluation\lib\site-packages\matplotlib\colorbar.py in __init__(self, ax, mappable, **kwargs)
+ 1195 if isinstance(mappable, martist.Artist):
+ 1196 _add_disjoint_kwargs(kwargs, alpha=mappable.get_alpha())
+-> 1197 super().__init__(ax, **kwargs)
+ 1198
+ 1199 mappable.colorbar = self
+
+~\Anaconda3\envs\data-evaluation\lib\site-packages\matplotlib\_api\deprecation.py in wrapper(*args, **kwargs)
+ 469 "parameter will become keyword-only %(removal)s.",
+ 470 name=name, obj_type=f"parameter of {func.__name__}()")
+--> 471 return func(*args, **kwargs)
+ 472
+ 473 return wrapper
+
+TypeError: __init__() got an unexpected keyword argument 'location'
+
+Why is that? If I use:
+colorbar = fig.colorbar(im, ax=ax, location='left')
+colorbar.set_label('y label')
+
+instead of divider, it seems to work, but the padding between the axis and colorbar is not as small as I want it.
","The example you cite does not use a location argument to colorbar:
+colorbar = fig.colorbar(im, cax=cax)
+
+That's because you are asking it to plot the colorbar in the new axes, which you already placed on the left with the statement cax = divider.append_axes("left", size="5%", pad=0.05). You can see that an empty axis is being generated for the colorbar before the error is thrown.
+Keep in mind that doing it this way puts the colorbar ticks and label on the side of the colorbar that faces the image rather than on its outer (left) side.
+You can avoid this by adding the following after the colorbar has been drawn (since the colorbar will undo the change):
+cax.yaxis.tick_left()
+cax.yaxis.set_label_position('left')
+
",python
+"I'm trying to use np.maximum.outer in Python 3.x but I'm getting this error: NotImplementedErrorI have a matrix like this:
+RCA = pd.DataFrame(
+ data=[
+ (1,0,0,0),
+ (1,1,1,0),
+ (0,0,1,0),
+ (0,1,0,1),
+ (1,0,1,0)],
+ columns=['ct1','ct2','ct3','ct4'],
+ index=['ind_1','ind_2','ind_3','ind_4','ind_5'])
+
+I'm trying to calculate:
+norms = RCA.sum()
+norm = np.maximum.outer(norms, norms)
+
+And I'm getting this error:
+---------------------------------------------------------------------------
+NotImplementedError Traceback (most recent call last)
+<ipython-input-9-4fd04a55ad8c> in <module>
+ 4
+ 5 norms = RCA.sum()
+----> 6 norm = np.maximum.outer(norms, norms)
+ 7 proximity = RCA.T.dot(RCA).div(norm)
+ 8
+
+~/opt/anaconda3/envs/py37/lib/python3.7/site-packages/pandas/core/series.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
+ 746 return None
+ 747 else:
+--> 748 return construct_return(result)
+ 749
+ 750 def __array__(self, dtype=None) -> np.ndarray:
+
+~/opt/anaconda3/envs/py37/lib/python3.7/site-packages/pandas/core/series.py in construct_return(result)
+ 735 if method == "outer":
+ 736 # GH#27198
+--> 737 raise NotImplementedError
+ 738 return result
+ 739 return self._constructor(result, index=index, name=name, copy=False)
+
+NotImplementedError:
+
+This works perfect in Python 2.7, but I need to run it in Python 3.x
+I need to find a way around this issue. Thanks a lot.
","In [181]: RCA
+Out[181]:
+ ct1 ct2 ct3 ct4
+ind_1 1 0 0 0
+ind_2 1 1 1 0
+ind_3 0 0 1 0
+ind_4 0 1 0 1
+ind_5 1 0 1 0
+In [182]: norms = RCA.sum()
+In [183]: norms
+Out[183]:
+ct1 3
+ct2 2
+ct3 3
+ct4 1
+dtype: int64
+In [184]: np.maximum.outer(norms,norms)
+Traceback (most recent call last):
+ File "<ipython-input-184-d24a173874f6>", line 1, in <module>
+ np.maximum.outer(norms,norms)
+ File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 2032, in __array_ufunc__
+ return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
+ File "/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py", line 381, in array_ufunc
+ result = reconstruct(result)
+ File "/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py", line 334, in reconstruct
+ raise NotImplementedError
+NotImplementedError
+
+Sometimes passing a dataframe (or Series) to a numpy function works ok, but apparently here we need to explicitly use the array values:
+In [185]: norms.values
+Out[185]: array([3, 2, 3, 1])
+In [186]: np.maximum.outer(norms.values,norms.values)
+Out[186]:
+array([[3, 3, 3, 3],
+ [3, 2, 3, 2],
+ [3, 3, 3, 3],
+ [3, 2, 3, 1]])
+
+Looking at the traceback, pandas intercepts NumPy ufunc calls through its own __array_ufunc__ hook. np.maximum(norms, norms) works, but pandas has not implemented the outer method and raises NotImplementedError instead. [186] is pure NumPy and returns an array.
+Plain np.maximum returns a Series:
+In [192]: np.maximum(norms,norms)
+Out[192]:
+ct1 3
+ct2 2
+ct3 3
+ct4 1
+dtype: int64
+
+outer returns a 2d array, which in pandas terms would be a dataframe, not a Series. That could explain why pandas does not implement outer.
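
If a labelled result is wanted, the array can be wrapped in a DataFrame by hand. This is a minimal sketch that assumes the column labels of RCA should serve as both index and columns of the result, and it uses to_numpy() in place of .values; the proximity line is taken from the traceback in the question.

```python
import numpy as np
import pandas as pd

RCA = pd.DataFrame(
    data=[(1, 0, 0, 0), (1, 1, 1, 0), (0, 0, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0)],
    columns=['ct1', 'ct2', 'ct3', 'ct4'],
    index=['ind_1', 'ind_2', 'ind_3', 'ind_4', 'ind_5'],
)

norms = RCA.sum()
# Run the ufunc on the plain NumPy array, then re-attach the labels by hand.
norm = pd.DataFrame(
    np.maximum.outer(norms.to_numpy(), norms.to_numpy()),
    index=norms.index,
    columns=norms.index,
)
# With matching labels, the division from the question aligns cell by cell.
proximity = RCA.T.dot(RCA).div(norm)
print(norm)
```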
",python
+"How to stop my rect from leaving a trail when moving over my background image
+I am making a pong game, and the ball object leaves a trail when it moves. How can I stop it from doing this? Any help would be greatly appreciated. Thank you in advance.
+Here is my code:
+
+import sys
+import pygame
+pygame.init()
+clock = pygame.time.Clock()
+# screen setup
+screen_width = 1200
+screen_height = 700
+screen = pygame.display.set_mode((screen_width, screen_height))
+background = pygame.image.load("spacebackground.png").convert_alpha()
+pygame.mouse.set_visible(False)
+pygame.display.set_caption('pong')
+
+
+# set up player and opponent and ball rect
+player = pygame.Rect(screen_width - 20,screen_height/2 - 70,10,140)
+opponent = pygame.Rect(10,screen_height/2 - 70,10,140 )
+ball = pygame.Rect(screen_width/2-15, screen_height/2 - 15,30, 30)
+
+ball_speed_x = 7
+ball_speed_y = 7
+# defining clock and colour
+red = (0,0,0)
+
+clock = pygame.time.Clock()
+
+while True:
+
+ pygame.display.update(ball)
+ pygame.display.flip()
+ screen.blit(background, (0, 0))
+ for event in pygame.event.get():
+ if event.type == pygame.QUIT:
+ pygame.quit()
+ sys.exit()
+
+ ball.x += ball_speed_x
+ ball.y += ball_speed_y
+ # Drawing the rects
+ pygame.draw.rect(background, red, player)
+ pygame.draw.rect(background, red, opponent)
+ pygame.draw.ellipse(background, red, ball)
+ pygame.display.update(ball)
+
+ clock.tick(60)
+
+
+
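+
","Draw the objects on the screen instead of on the background. Drawing on the background surface changes it permanently, so every earlier ball position stays visible as a trail: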
+run = True
+while run:
+ for event in pygame.event.get():
+ if event.type == pygame.QUIT:
+ run = False
+
+ ball.x += ball_speed_x
+ ball.y += ball_speed_y
+
+ # draw background on the screen
+ screen.blit(background, (0, 0))
+
+ # draw objects on the screen
+ pygame.draw.rect(screen, red, player)
+ pygame.draw.rect(screen, red, opponent)
+ pygame.draw.ellipse(screen, red, ball)
+
+ # update the display
+ pygame.display.flip()
+
+ clock.tick(60)
+
+pygame.quit()
+sys.exit()
+
",python
+"Removing duplicates from the list of pydantic objectsI tried to remove duplicates from the list of pydantic objects, but faced a problem that I could not solve. The only working method is very slow.
+Is there a faster way to remove duplicates than my method?
+Code:
+Pydantic model (a.py)
+from pydantic import BaseModel
+
+
+class Photo(BaseModel):
+ title: str
+ url: str
+
+Main file (b.py)
+from collections import OrderedDict
+from a import Photo
+
+# 3 objects, 2 duplicates
+a_obj = {
+ 'title': 'SOME TITLE v1',
+ 'url': 'http://some.url'
+}
+b_obj = {
+ 'title': 'SOME TITLE v2',
+ 'url': 'http://different.url'
+}
+c_obj = {
+ 'title': 'SOME TITLE v1',
+ 'url': 'http://some.url'
+}
+
+# Creating list of pydantic objects
+pd_obj_list = list()
+pd_obj_list += [Photo(**a_obj)]
+pd_obj_list += [Photo(**b_obj)]
+pd_obj_list += [Photo(**c_obj)]
+
+# My Attempts to Remove Duplicates
+
+# Using OrderedDict.fromkeys
+final_list_0 = list(OrderedDict.fromkeys(pd_obj_list))
+# returns TypeError: unhashable type: 'Photo'
+
+# Using Set
+final_list_1 = list(set(pd_obj_list))
+# returns TypeError: unhashable type: 'Photo'
+
+# Using enumerate
+final_list_2 = [i for n, i in enumerate(pd_obj_list) if i not in pd_obj_list[:n]]
+# It works but too slow when I have ~10k objects in the list
+
","Use:
+pd_obj_list = [Photo(**a_obj), Photo(**b_obj), Photo(**c_obj)]
+final_list_0 = list(OrderedDict(((photo.title, photo.url), photo) for photo in pd_obj_list).values())
+print(final_list_0)
+
+Output
+[Photo(title='SOME TITLE v1', url='http://some.url'), Photo(title='SOME TITLE v2', url='http://different.url')]
+
+If Photo is immutable you could define __hash__ as follows:
+from collections import OrderedDict
+
+from pydantic import BaseModel
+
+
+class Photo(BaseModel):
+ title: str
+ url: str
+
+ def __hash__(self):
+ return hash((self.title, self.url))
+
+
+# 3 objects, 2 duplicates
+a_obj = {
+ 'title': 'SOME TITLE v1',
+ 'url': 'http://some.url'
+}
+b_obj = {
+ 'title': 'SOME TITLE v2',
+ 'url': 'http://different.url'
+}
+c_obj = {
+ 'title': 'SOME TITLE v1',
+ 'url': 'http://some.url'
+}
+
+pd_obj_list = [Photo(**a_obj), Photo(**b_obj), Photo(**c_obj)]
+final_list_0 = list(OrderedDict.fromkeys(pd_obj_list))
+print(final_list_0)
+
+Output
+[Photo(title='SOME TITLE v1', url='http://some.url'), Photo(title='SOME TITLE v2', url='http://different.url')]
+
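The same idea can be written as an explicit loop that keeps the first occurrence of each photo and runs in linear time. This is a sketch only: it uses a plain dataclass as an unhashable stand-in for the pydantic model so that it runs without pydantic installed, and the helper name dedupe is hypothetical.

```python
from dataclasses import dataclass


@dataclass  # eq=True without frozen=True makes instances unhashable
class Photo:
    title: str
    url: str


def dedupe(photos):
    """Return photos without duplicates, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for photo in photos:
        key = (photo.title, photo.url)  # hashable stand-in for the object
        if key not in seen:
            seen.add(key)
            unique.append(photo)
    return unique


photos = [
    Photo('SOME TITLE v1', 'http://some.url'),
    Photo('SOME TITLE v2', 'http://different.url'),
    Photo('SOME TITLE v1', 'http://some.url'),
]
print(dedupe(photos))
```

Each membership test against the set is a hash lookup, whereas the enumerate version in the question rescans a list slice for every element.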
",python
+"How to set hidden_layer_sizes in sklearn MLPRegressor using optuna trialI would like to use Optuna with the sklearn MLPRegressor model.
+For almost all hyperparameters it is quite straightforward how to set OPTUNA for them.
+For example, to set the learning rate:
+learning_rate_init = trial.suggest_float('learning_rate_init ',0.0001, 0.1001, step=0.005)
+My problem is how to set it for hidden_layer_sizes since it is a tuple. So let's say I would like to have two hidden layers where the first will have 100 neurons and the second will have 50 neurons. Without OPTUNA I would do:
+MLPRegressor( hidden_layer_sizes =(100,50))
+But what if I want OPTUNA to try a different number of neurons in each layer, e.g. from 100 to 500? How can I set that up, given that MLPRegressor expects a tuple?
","You could set up your objective function as follows:
+import optuna
+import warnings
+from sklearn.datasets import make_regression
+from sklearn.model_selection import train_test_split
+from sklearn.neural_network import MLPRegressor
+from sklearn.metrics import mean_squared_error
+warnings.filterwarnings('ignore')
+
+X, y = make_regression(random_state=1)
+
+X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=1)
+
+def objective(trial):
+
+ params = {
+ 'learning_rate_init': trial.suggest_float('learning_rate_init ', 0.0001, 0.1, step=0.005),
+ 'first_layer_neurons': trial.suggest_int('first_layer_neurons', 10, 100, step=10),
+ 'second_layer_neurons': trial.suggest_int('second_layer_neurons', 10, 100, step=10),
+ 'activation': trial.suggest_categorical('activation', ['identity', 'tanh', 'relu']),
+ }
+
+ model = MLPRegressor(
+ hidden_layer_sizes=(params['first_layer_neurons'], params['second_layer_neurons']),
+ learning_rate_init=params['learning_rate_init'],
+ activation=params['activation'],
+ random_state=1,
+ max_iter=100
+ )
+
+ model.fit(X_train, y_train)
+
+ return mean_squared_error(y_valid, model.predict(X_valid), squared=False)
+
+study = optuna.create_study(direction='minimize')
+study.optimize(objective, n_trials=3)
+# [I 2021-11-11 18:04:02,216] A new study created in memory with name: no-name-14c92e38-b8cd-4b8d-8a95-77158d996f20
+# [I 2021-11-11 18:04:02,283] Trial 0 finished with value: 161.8347337123744 and parameters: {'learning_rate_init ': 0.0651, 'first_layer_neurons': 20, 'second_layer_neurons': 40, 'activation': 'tanh'}. Best is trial 0 with value: 161.8347337123744.
+# [I 2021-11-11 18:04:02,368] Trial 1 finished with value: 159.55535852658082 and parameters: {'learning_rate_init ': 0.0551, 'first_layer_neurons': 90, 'second_layer_neurons': 70, 'activation': 'relu'}. Best is trial 1 with value: 159.55535852658082.
+# [I 2021-11-11 18:04:02,440] Trial 2 finished with value: 161.73980822730888 and parameters: {'learning_rate_init ': 0.0051, 'first_layer_neurons': 100, 'second_layer_neurons': 30, 'activation': 'identity'}. Best is trial 1 with value: 159.55535852658082.
+
",python
+"Merge should adopt Nan values if not existing valueHowever, I have the following problem:
+
+If a year or date does not exist in df2 then a price and a listing_id is automatically added during the merge. But that should be NaN
+
+The second problem is that when several rows in df2 fall on the same day and year, the temperature is repeated on every merged row, for example:
+
+
+d = {'id': [1], 'day': [1], 'temperature': [20], 'year': [2001]}
+df = pd.DataFrame(data=d)
+print(df)
+
+ id day temperature year
+0 1 1 20 2001
+
+d2 = {'id': [122, 244], 'day': [1, 1],
+ 'listing_id': [2, 4], 'price': [20, 440], 'year': [2001, 2001]}
+df2 = pd.DataFrame(data=d2)
+print(df2)
+
+ id day listing_id price year
+0 122 1 2 20 2001
+1 244 1 4 440 2001
+
+df3 = pd.merge(df,df2[['day', 'listing_id', 'price']],
+ left_on='day', right_on = 'day',how='left')
+print(df3)
+
+ id day temperature year listing_id price
+0 1 1 20 2001 2 20
+1 1 1 20 2001 4 440 # <-- The second temperature is wrong :/
+
+This should not happen: if I later also have a row from year 2002 on day 1 with a temperature of 30 and want to calculate the average, I get (20 + 20 + 30) / 3 = 23.3, whereas it should be (20 + 30) / 2 = 25. Therefore, if a value has already been filled in, the repeated entry should be NaN.
+Code Snippet
+d = {'id': [1, 2, 3, 4, 5], 'day': [1, 2, 3, 4, 2],
+ 'temperature': [20, 40, 50, 60, 20], 'year': [2001, 2002, 2004, 2005, 1999]}
+df = pd.DataFrame(data=d)
+print(df)
+
+ id day temperature year
+0 1 1 20 2001
+1 2 2 40 2002
+2 3 3 50 2004
+3 4 4 60 2005
+4 5 2 20 1999
+
+
+d2 = {'id': [122, 244, 387, 4454, 521], 'day': [1, 2, 3, 4, 2],
+ 'listing_id': [2, 4, 5, 6, 7], 'price': [20, 440, 500, 6600, 500],
+ 'year': [2001, 2002, 2004, 2005, 2005]}
+df2 = pd.DataFrame(data=d2)
+print(df2)
+
+ id day listing_id price year
+0 122 1 2 20 2001
+1 244 2 4 440 2002
+2 387 3 5 500 2004
+3 4454 4 6 6600 2005
+4 521 2 7 500 2005
+
+
+df3 = pd.merge(df,df2[['day','listing_id', 'price']],
+ left_on='day', right_on = 'day',how='left').drop('day',axis=1)
+print(df3)
+
+ id day temperature year listing_id price
+0 1 1 20 2001 2 20
+1 2 2 40 2002 4 440
+2 2 2 40 2002 7 500
+3 3 3 50 2004 5 500
+4 4 4 60 2005 6 6600
+5 5 2 20 1999 4 440
+6 5 2 20 1999 7 500
+
+
+
+What I want
+ id day temperature year listing_id price
+0 1 1 20 2001 2 20
+1 2 2 40 2002 4 440
+2 2 2 NaN 2005 7 500
+3 3 3 50 2004 5 500
+4 4 4 60 2005 6 6600
+5 5 2 20 1999 NaN NaN
+
","IIUC:
+>>> df.merge(df2[['day', 'listing_id', 'price', 'year']],
+ on=['day', 'year'], how='outer')
+
+ id day temperature year listing_id price
+0 1.0 1 20.0 2001 2.0 20.0
+1 2.0 2 40.0 2002 4.0 440.0
+2 3.0 3 50.0 2004 5.0 500.0
+3 4.0 4 60.0 2005 6.0 6600.0
+4 5.0 2 20.0 1999 NaN NaN
+5 NaN 2 NaN 2005 7.0 500.0
+
",python
+"Button Not Adding Updef login_button():
+ wrong_value_count = 0
+ username = username_entry.get()
+ password = password_entry.get()
+ if username == user1_username and password == user1_password:
+ redirecting_text = canvas.create_text(40, 90, text="Credentials Match, Please Wait While "
+ "We Redirect You To Your Vault", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text(redirecting_text))
+ print("ESHTA")
+ elif username == user2_username and password == user2_password:
+ redirecting_text2 = canvas.create_text(40, 90, text="Credentials Match, Please Wait While "
+ "We Redirect You To Your Vault", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text3(redirecting_text2))
+ print("ESHTA")
+ else:
+ wrong_value_count += 1
+ if wrong_value_count <= 3:
+ print("NOT ESHTA")
+ wrong_credentials_text = canvas.create_text(40, 90, text="Wrong Credentials, Try again",
+ fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text2(wrong_credentials_text))
+ elif wrong_value_count > 3:
+ lock_text = canvas.create_text(40, 90, text="Sorry, You've Reached the Max Number of Trials"
+ " Please Try Again Later", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text4(lock_text))
+ print(wrong_value_count)
+
+Alright, so I'm trying to add to the wrong value count, but every time I click the button, it shows that it's still 1. How do I make it add up until it reaches 3 as written?
","wrong_value_count must be defined outside the login_button function, because every call to login_button resets the count to 0, which you don't want.
+wrong_value_count = 0
+def login_button():
+ global wrong_value_count
+ username = username_entry.get()
+ password = password_entry.get()
+ if username == user1_username and password == user1_password:
+ redirecting_text = canvas.create_text(40, 90, text="Credentials Match, Please Wait While "
+ "We Redirect You To Your Vault", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text(redirecting_text))
+ print("ESHTA")
+ elif username == user2_username and password == user2_password:
+ redirecting_text2 = canvas.create_text(40, 90, text="Credentials Match, Please Wait While "
+ "We Redirect You To Your Vault", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text3(redirecting_text2))
+ print("ESHTA")
+ else:
+ wrong_value_count += 1
+ if wrong_value_count <= 3:
+ print("NOT ESHTA")
+ wrong_credentials_text = canvas.create_text(40, 90, text="Wrong Credentials, Try again",
+ fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text2(wrong_credentials_text))
+ elif wrong_value_count > 3:
+ lock_text = canvas.create_text(40, 90, text="Sorry, You've Reached the Max Number of Trials"
+ " Please Try Again Later", fill="red", anchor=NW)
+ canvas.after(2500, lambda: delete_text4(lock_text))
+ print(wrong_value_count)
+
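An alternative to the global statement is to keep the counter on a small object that outlives each button click. The sketch below leaves out the tkinter parts; the class name LoginAttempts and its method register_failure are hypothetical names introduced for illustration.

```python
class LoginAttempts:
    """Counts failed logins across calls instead of resetting on each click."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.failed = 0

    def register_failure(self):
        """Record one failed attempt and report whether the user is locked out."""
        self.failed += 1
        return self.failed > self.max_attempts


attempts = LoginAttempts()
for _ in range(4):
    locked = attempts.register_failure()
    print(attempts.failed, locked)  # the fourth failure reports True
```

In login_button, the else branch would call attempts.register_failure() and pick the message to show based on the returned flag.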
",python
+"python request to get html page is missing the article element contentsI'm trying to record data from this website using Python, but the data that I get is missing the "dashboard" data.
+The article element containing it comes back empty,
+like this:
+<article id="dashboard"></article>
+
+I'm making a request like this:
+import requests
+page = requests.get(url)
+
+How can I get the data from the website consistently and quickly?
","What you're seeing here is a JavaScript + WebSocket driven web app.
+When you connect to the page, JavaScript opens a websocket connection to the server, which then constantly sends data to the JavaScript client. The client unpacks it into the HTML content that you see.
+If you fire up a web inspector, you can see this websocket connection open.
+Now how do we replicate this in a scraper?
+We need a websocket client that sends the same outgoing messages (shown in green in the inspector) to start receiving the data you want. For example, with the websocket-client package we can do:
+from websocket import create_connection
+ws = create_connection("wss://bitcoin.clarkmoody.com/dashboard/ws")
+
+# replicate the green messages
+ws.send("""{"op":"c","ch":"","pl":{"c":"4de43be4236035c5","s":"9f6e08f07c263998"}}""")
+ws.send("""{"op":"sub","ch":"mod"}""")
+ws.send("""{"op":"sub","ch":"sta"}""")
+ws.send("""{"op":"sub","ch":"sys"}""")
+ws.send("""{"op":"sub","ch":"upd"}""")
+
+# then you can start receiving the data
+
+while True:
+ print(ws.recv())
+
+Now it's up to you to figure out the rest of reverse-engineering. For the initial messages it seems to be some sort of subscription (op: sub, ch:upd probably means operation subscribe channel UPD). Either way the above script should output this as first response message then continue spewing price adjustments:
+{"op":"dat","ch":"mod","pl":[{"rows":[{"slug":"p-row","cells":[{"type":"label","slug":"p-label","quiet":true,"label":"Price"},{"type":"price","slug":"p","def":"Market price of Bitcoin","sep":true,"unit":"$","prefix":true,"places":2}]},{"slug":"sd-row","cells":[{"type":"label","slug":"sd-label","quiet":true,"label":"Sats per Dollar"},{"type":"integer","slug":"sd","def":"Value of one US Dollar, expressed in Satoshis","sep":true,"places":0}]},{"slug":"c-row","cells":[{"type":"label","slug":"c-label","quiet":true,"label":"Market Capitalization"},{"type":"price","slug":"c","def":"Product of market price times total mined supply","sep":true,"unit":"$","prefix":true,"places":2}]}],"slug":"markets","name":"Markets","order":10,"help":"Bitcoin spot price and futures information","feature":false,"headless":false},{"rows":[{"slug":"links-row","noInfo":true,"cells":<..TOO LONG FOR SO..>
+
",python
+"pygame - rects collision detection issueI'm learning pygame and I have an issue with detection of rect's collisions. I'm using colliderect() function now but it works only when rects overlap, and the question is "How to detect even edges collisions?". General comments about the whole are welcome. First post btw.
+Here is my code:
+import pygame
+
+
+#####SETTINGS#####
+HEIGHT = 1080
+WIDTH = 1920
+BLOCK_SIZE = 60
+##################
+
+
+class Level():
+ def __init__(self, file):
+ self.file = file
+ self.blocks = []
+ self.map = self.load_from_file()
+
+ def load_from_file(self):
+ map = []
+ file = open(self.file + '.txt', 'r')
+ data = file.read()
+ file.close()
+ data = data.split('\n')
+ for x in data:
+ map.append(list(x))
+ return map
+
+ def render(self, screen):
+ self.blocks = []
+ y = 0
+ for row in self.map:
+ x = 0
+ for block in row:
+ if block != '0':
+ self.blocks.append(pygame.Rect(x * BLOCK_SIZE, y * BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE))
+ if block == '1':
+ pygame.draw.rect(screen, (56,24,0), (x * BLOCK_SIZE, y * BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE))
+ elif block == '2':
+ pygame.draw.rect(screen, (18,115,81), (x * BLOCK_SIZE, y * BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE))
+ x += 1
+ y += 1
+
+
+class Player():
+ def __init__(self, x, y, color):
+ self.x = x
+ self.y = y
+ self.color = color
+ self.rect = pygame.Rect(x, y, BLOCK_SIZE, BLOCK_SIZE)
+ self.go_left = False
+ self.go_right = False
+ self.go_up = False
+ self.go_down = False
+ self.collisions = {'left' : False, 'right' : False, 'top' : False, 'bottom' : False}
+
+ def move(self):
+ self.collisions = test_collisions(self.rect, level.blocks)
+ if self.go_left and not self.collisions['left']:
+ self.x -= 10
+ self.go_left = False
+ if self.go_right and not self.collisions['right']:
+ self.x += 10
+ self.go_right = False
+ if self.go_up and not self.collisions['top']:
+ self.y -= 10
+ self.go_up = False
+ if self.go_down and not self.collisions['bottom']:
+ self.y += 10
+ self.go_down = False
+ self.rect = pygame.Rect(self.x, self.y, BLOCK_SIZE, BLOCK_SIZE)
+
+ def render(self, screen):
+ pygame.draw.rect(screen, self.color, self.rect)
+
+
+def render(screen, player, level):
+ screen.fill((49, 113, 181))
+ level.render(screen)
+ player.render(screen)
+ pygame.display.update()
+
+
+def handle_events(player):
+ for event in pygame.event.get():
+ if event.type == pygame.QUIT or event.type == pygame.KEYDOWN and event.key == pygame.K_ESCAPE:
+ run = False
+ pygame.quit()
+ keys = pygame.key.get_pressed()
+ if keys[pygame.K_a]:
+ player.go_left = True
+ if keys[pygame.K_d]:
+ player.go_right = True
+ if keys[pygame.K_w]:
+ player.go_up = True
+ if keys[pygame.K_s]:
+ player.go_down = True
+
+
+def test_collisions(object, rects):
+ collisions = {'left' : False, 'right' : False, 'top' : False, 'bottom' : False}
+ for rect in rects:
+ if object.colliderect(rect):
+ if object.x <= rect.x:
+ collisions['right'] = True
+ if object.x >= rect.x:
+ collisions['left'] = True
+ if object.y >= rect.y:
+ collisions['top'] = True
+ if object.y <= rect.y:
+ collisions['bottom'] = True
+ return(collisions)
+
+
+def main_loop():
+ clock = pygame.time.Clock()
+ while run:
+ clock.tick(60)
+ handle_events(player)
+ player.move()
+ render(screen, player, level)
+
+
+if __name__ == "__main__":
+ run = True
+ screen = pygame.display.set_mode((WIDTH, HEIGHT))
+ level = Level('assets/level_one')
+ player = Player(0,0,(255,255,0))
+ main_loop()
+
+And here is level_one.txt file content:
+00000000000000000000000000000000
+00000000000000000000000000000000
+00000000000000000000000000000000
+00000000000000000000000000000000
+00000000000000000000000000000000
+00000000000000000000000000000000
+00000000000000002222222222200000
+00000000000000000000000000000000
+00000000000000000000000000000000
+00022222222200000000000000000000
+00000000000000000000000000000000
+00000000000000000000000000000000
+22222222222222222222222222222222
+11111111111111111111111111111111
+11111111111111111111111111111111
+11111111111111111111111111111111
+11111111111111111111111111111111
+11111111111111111111111111111111
+
","You can enlarge the rectangles for the collision detection with pygame.Rect.inflate, e.g.:
+if object.colliderect(rect.inflate(1, 1)):
+ # [...]
+
",python
+"concat lines from several .txt file using python?I'm using Python 3.6 on Windows 10. I have a task to take the lines from 1.txt, concatenate each of them with a string, and put the results in a 2.txt file.
+this is my code:
+full_url = "https://mysite/images/pic_person/small/"
+
+#file read from
+pic_name = open("test.txt","r")
+#file write to it
+full_name = open("full_name.txt","a")
+while True:
+ line = pic_name.readline()
+ link = line+full_url
+ print(link)
+ full_name.write(link)
+ if ("" == line):
+ print("file finished")
+ break;
+pic_name.close()
+full_name.close()
+
+after executing the code it gives me this result:
+
+p100003.jpg
+https://mysite/images/pic_person/small/p100026.jpg
+https://mysite/images/pic_person/small/p100951.jpg
+https://mysite/images/pic_person/small/p100970.jpg
+https://mysite/images/pic_person/small/p101144.jpghttps://mysite/images/pic_person/small/https://mysite/images/pic_person/small/
+
+
+and the expected result is:
+
+https://mysite/images/pic_person/small/p100026.jpg
+https://mysite/images/pic_person/small/p100951.jpg
+https://mysite/images/pic_person/small/p100970.jpg
+https://mysite/images/pic_person/small/p101144.jpg
+
+
+The file test.txt contains these lines:
+p100026.jpg
+p100951.jpg
+p100970.jpg
+p101144.jpg
+
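","To append to the end of a file, open it with the 'a' flag, and write "\n" after each entry as the line break. Strip the newline from each line you read and put the URL in front of the file name: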
+full_url = "https://mysite/images/pic_person/small/"
+
+# file read from test.txt
+f = open("test.txt", "r")
+
+# iterate over each line of test.txt and write the full URL
+with open("text2.txt", "a") as file_object:
+ for item in f:
+ file_object.write(full_url+item.splitlines()[0]+"\n")
+
+f.close()
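+A self-contained sketch of the same idea, assuming two hypothetical file names created in a temporary directory so the example does not touch real files. The helper name prefix_lines is illustrative, not part of the original answer:

```python
import tempfile
from pathlib import Path

FULL_URL = "https://mysite/images/pic_person/small/"


def prefix_lines(src, dst, prefix):
    """Write each non-empty line of src to dst with prefix in front of it."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            name = line.strip()  # drop the trailing newline before joining
            if name:             # skip blank lines, e.g. at the end of the file
                fout.write(prefix + name + "\n")


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "test.txt"
    dst = Path(tmp) / "full_name.txt"
    src.write_text("p100026.jpg\np100951.jpg\n")
    prefix_lines(src, dst, FULL_URL)
    result = dst.read_text().splitlines()

print(result)
```

+Using with for both files closes them even if an error occurs, and iterating over the file object ends cleanly at end of file, so no explicit check for an empty line is needed.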
+
",python
+"PyQt6: app shows permanently Gnome taskbarRunning a simple PyQt6 app in the Gnome desktop environment shows the taskbar permanently.
+OS information:
+
+Distributor ID: Kali
+
+Description: Kali GNU/Linux
+
+Rolling Release: 2021.3
+
+Codename: kali-rolling
+
+Code:
+import sys
+from PyQt6.QtWidgets import QApplication, QWidget
+
+
+def main():
+
+ app = QApplication(sys.argv)
+
+ w = QWidget()
+ w.resize(250, 200)
+ w.move(300, 300)
+
+ w.setWindowTitle('Simple')
+ w.show()
+
+ sys.exit(app.exec())
+
+
+if __name__ == '__main__':
+ main()
+
+How can this behaviour be prevented?
","I observed this is normal behaviour on any application when the window is not maximized. When I maximize, the taskbar disappears.
",python
+"Pandas: Determine if columns are matchedI'm trying to eliminate all rows that match in col0 and col1 but don't have a pair of -1, 1 between rows (for example, in the dataframe below there isn't an a2, b1, -1 row). I was trying to come up with some way to do this, but I was using groupby, getting a MultiIndex, and not getting anywhere...
+# no a2, b1, -1
+df = pd.DataFrame([
+ ['a1', 'b1', -1, 0/1],
+ ['a1', 'b1', 1, 1/1],
+ ['a1', 'b2', -1, 2/1],
+ ['a1', 'b2', 1, 1/2],
+ ['a2', 'b1', 1, 1/3],
+ ['a2', 'b2', -1, 2/3],
+ ['a2', 'b2', 1, 4/1]
+], columns=['col0', 'col1', 'col2', 'val'])
+
+# desired output
+# a1, b1, -1, 0.0
+# a1, b1, 1, 1.0
+# a1, b2, -1, 2.0
+# a1, b2, 1, 0.5
+# a2, b2, -1, 0.66667
+# a2, b2, 1, 4.0
+
","We can use groupby filter to test if there are at least 1 (any) of each value (-1 and 1) per group with Series.any:
+result_df = df.groupby(['col0', 'col1']).filter(
+ lambda x: x['col2'].eq(-1).any() and x['col2'].eq(1).any()
+)
+
+result_df:
+ col0 col1 col2 val
+0 a1 b1 -1 0.000000
+1 a1 b1 1 1.000000
+2 a1 b2 -1 2.000000
+3 a1 b2 1 0.500000
+5 a2 b2 -1 0.666667
+6 a2 b2 1 4.000000
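+As a possible alternative, the same selection can be sketched with a boolean mask built by groupby and transform. This is an illustrative variant rather than the method above; the names required and has_both are made up for the example:

```python
import pandas as pd

df = pd.DataFrame([
    ['a1', 'b1', -1, 0/1],
    ['a1', 'b1', 1, 1/1],
    ['a1', 'b2', -1, 2/1],
    ['a1', 'b2', 1, 1/2],
    ['a2', 'b1', 1, 1/3],
    ['a2', 'b2', -1, 2/3],
    ['a2', 'b2', 1, 4/1],
], columns=['col0', 'col1', 'col2', 'val'])

required = {-1, 1}

# For every row, record whether its (col0, col1) group contains both -1 and 1.
has_both = (
    df.groupby(['col0', 'col1'])['col2']
      .transform(lambda s: required.issubset(set(s)))
      .astype(bool)  # make sure the mask is boolean before indexing with it
)

result_df = df[has_both]
print(result_df)
```

+The mask keeps the original index, so the (a2, b1) row at index 4 is the only one dropped.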
+
",python
+"How to generate a line plot for data in column on condition of data in 2 other columnsIn my df I have 4 columns. The 1st two describe test conditions. Columns 3 & 4 are test data. Column 3 data is my fixed x axis data. I want a line overlay plot with legend for my Column 4 INL data, y axis data, for conditions based in Column 1 & 2.
+The below code shows an example df. I can Group test_no and temperature columns to get min/max data from the INL data for the values of the group data that I want. I can't use the same group set-up to plot INL data for test_no=0,1 & Temperature=25,50. How can I achieve this? Thanks
+import pandas as pd
+
+import matplotlib.pyplot as plt
+
+data = {'Test_no': [0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1], 'Temperature': [25, 25, 25, 25, 50, 50, 50, 50, 25, 25, 25, 25, 50, 50, 50, 50], 'Codes': [0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,], 'INL':[0,1.1,-0.9,0, 0,1.0,-0.8,0, 0,0.9,-0.7,0, 0,1.2,-0.6,0]}
+
+df = pd.DataFrame(data)
+
+groups = df.groupby(['Test_no','Temperature'])
+
+#group data and get min,max,mean per Col1/2 group
+
+result = groups.agg({'INL': ['min', 'max', 'mean']})
+print(result)
+
+#Plot data per Col1/2 group (start with C1:0,C2:25)
+
+d = df.loc[df["Temperature"] == 25]
+
+d = df.loc[df["Test_no"] == 0]
+
+d['INL'].plot()
+
+
","I am relatively new to python but I was able to resolve my issue using the groupby() function to split the data into the groups that I required and then use matplotlib.pyplot to index and plot each group of data.
+I'm sure there are more efficient solutions but this works!
+Using the same DF as I used in the question I added the below code:
+Tests = [g for _, g in df.groupby(['Test_no','Temperature'])]
+
+plt.plot(Tests[0].Codes, Tests[0].INL, label='T0 25C')
+plt.plot(Tests[1].Codes, Tests[1].INL, label='T0 50C')
+plt.plot(Tests[2].Codes, Tests[2].INL, label='T1 25C')
+plt.plot(Tests[3].Codes, Tests[3].INL, label='T1 50C')
+
+plt.xlabel("Codes")
+plt.ylabel("INL")
+plt.title("INL Data")
+plt.legend()
+plt.show()
+
+
+Or you can use a more compact code to plot:
+for (n, temp), Test in df.groupby(['Test_no','Temperature']):
+ plt.plot(Test.Codes, Test.INL, label=f"T{n} {temp}C")
+
",python
+"Why does python's SharedMemory seem to initialize arrays to zerosI'm initializing a SharedMemory in python to be shared between multiple processes and I've noticed that it always seems to be filled with zeros (which is fine), but I don't understand why this is occurring as the documentation doesn't state there is a default value to fill the memory with.
+This is my test code, run in two separate PowerShell windows. Shell 1:
+import numpy as np
+from multiprocessing.shared_memory import SharedMemory
+def get_array_nbytes(rows, cols, dtype):
+ array = np.zeros((rows, cols), dtype=dtype)
+ nbytes = array.nbytes
+ del array
+ return nbytes
+
+rows = 10000000
+depths_columns = 18
+array_sm = SharedMemory(create=True, size=get_array_nbytes(rows, depths_columns, np.float32), name='array_sm')
+
+shell 2:
+from multiprocessing.shared_memory import SharedMemory
+import numpy as np
+rows = 10000000
+# attach to the block created in shell 1 (same name as above)
+array_sm = SharedMemory("array_sm")
+array = np.ndarray((rows, 18), dtype=np.float32, buffer=array_sm.buf)
+
+now in the second shell you can follow this up with:
+array[0]
+array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
+ 0.], dtype=float32)
+
+or
+np.where(array != 0)
+(array([], dtype=int64), array([], dtype=int64))
+
+Is this behavior always going to be the case or is this a fluke? Is there some sort of undocumented initialization to zero happening in the background?
","This is operating system dependent. Python doesn't initialize the memory; it just takes the virtual memory address offered by the operating system. On POSIX systems it uses shm_open, while on Windows it's CreateFileMapping. On Linux and Windows, these calls guarantee that the memory is initialized to zero.
+It would be a security leak to let the application see whatever leftover data happens to be in RAM from the previous user, so the memory needs to be filled with something. But this isn't a guarantee from Python, and it's possible that some operating systems (an embedded OS, perhaps) don't do things that way.
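+If the code must not depend on that operating system behavior, one option is to zero the block explicitly right after creating it. A minimal sketch, assuming a small anonymous block; the helper name create_zeroed is hypothetical:

```python
from multiprocessing.shared_memory import SharedMemory


def create_zeroed(size):
    """Create a shared memory block and explicitly fill it with zero bytes."""
    shm = SharedMemory(create=True, size=size)
    shm.buf[:size] = bytes(size)  # bytes(n) is n zero bytes
    return shm


size = 1024
shm = create_zeroed(size)
try:
    # any() is False only if every byte in the requested range is zero
    all_zero = not any(shm.buf[:size])
    print(all_zero)
finally:
    shm.close()
    shm.unlink()  # the creator is responsible for removing the block
```

+The explicit fill costs one pass over the memory, which is usually negligible next to the work done on the array afterwards.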
",python
+"How to pass cards to another player and so on. Say P1 draws 4 of clubs, how do I pass to P2I'm assuming I would want to put it under the Player class where I've started, but no matter what I've done I don't understand how to go about it. I'd prefer to be pointed in the right direction as opposed to just giving me code.
+import random
+
+class Card(object):
+ def __init__(self, suit, val):
+ self.suit = suit
+ self.value = val
+
+ def show(self):
+ print("{} of {}".format(self.value, self.suit))
+
+class Deck(object):
+ def __init__(self):
+ self.cards = []
+ self.build()
+#suits of the cards
+ def build(self):
+ for s in ["Spades", "Clubs", "Diamonds", "Hearts"]:
+ for v in range(1, 14):
+ self.cards.append(Card(s, v))
+
+ def show(self):
+ for c in self.cards:
+ c.show()
+
+#shuffle the deck
+ def shuffle(self):
+ for i in range(len(self.cards)-1, 0, -1):
+ r = random.randint(0, i)
+ self.cards[i], self.cards[r] = self.cards[r], self.cards[i]
+
+ def draw(self):
+ return self.cards.pop()
+
+class Player(object):
+
+ def __init__(self):
+ self.hand = []
+
+#draw cards
+ def draw(self, deck):
+ self.hand.append(deck.draw())
+ return self
+
+#show the cards in the players hand
+ def showHand(self):
+ for card in self.hand:
+ card.show()
+
+#Pass card to next player
+ def passCard(self):
+ self.hand.pop()
+
+
+deck = Deck()
+deck.shuffle()
+
+bob = Player()
+bob.draw(deck).draw(deck).draw(deck).draw(deck)
+bob.showHand()
+
","I would have two methods in the player class:
+# In Python, prefer snake_case over camelCase
+def give_card(self, index):
+ return self.hand.pop(index)
+
+def take_card(self, card):
+ self.hand.append(card)
+
+And then anytime you want to pass one card from one player to another, you would do something like:
+player_2.take_card(player_1.give_card(0))
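+To pass cards around a whole table, the two methods can be combined in a loop. A small sketch, using plain strings as stand-ins for Card objects; the function pass_around and the player names are made up for illustration:

```python
class Player:
    def __init__(self, name):
        self.name = name
        self.hand = []

    def take_card(self, card):
        self.hand.append(card)

    def give_card(self, index):
        return self.hand.pop(index)


def pass_around(players, index=0):
    """Every player gives one card to the next player (the last gives to the first)."""
    # Collect all passed cards first so nobody passes on a card they just received.
    passed = [player.give_card(index) for player in players]
    for i, card in enumerate(passed):
        players[(i + 1) % len(players)].take_card(card)


p1, p2 = Player("P1"), Player("P2")
p1.take_card("4 of Clubs")
p2.take_card("9 of Hearts")

pass_around([p1, p2])
print(p1.hand, p2.hand)
```

+Keeping give_card and take_card on the player means the game loop never touches a hand list directly.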
+
",python
+"numpy array with inequality sign?mask = np.stack([robject_mask, rshadow_mask, mask], axis=1).astype('float')
+
+ mask[mask >= 128] = 255
+ mask[mask < 128] = 0
+
+I'm practicing building an attention network with PyTorch, and I have a question about this code. I think it's about normalizing images to [-1, 1], but I can't understand exactly what it means. Does anyone know the name of this operation and what it does, or a keyword I can search for?
","Your question is not clear, but I'll try explaining what's going on.
+mask >= 128
+
+This outputs a boolean array of the form [True, True, False, ...], where each True corresponds to an index where mask had a value of at least 128.
+Then mask[mask >= 128] selects every element of mask where that boolean array is True, and the assignment sets those elements to 255. The same goes for < 128, just the opposite.
+At the end of the process, mask contains either 255 or 0 at each index, depending on whether the original value was at least 128 or not. This is usually called thresholding (or binarizing) with a boolean mask.
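+A tiny sketch of the two assignments on made-up values, next to an equivalent one-step form with np.where:

```python
import numpy as np

mask = np.array([[0, 100, 127],
                 [128, 200, 255]], dtype=float)

# Two-step boolean-mask assignment, as in the question
in_place = mask.copy()
in_place[in_place >= 128] = 255
in_place[in_place < 128] = 0

# Equivalent single expression: pick 255 where the condition holds, else 0
binary = np.where(mask >= 128, 255.0, 0.0)

print(mask >= 128)
print(binary)
```

+The order of the two assignments does not matter here, because values set to 255 stay at or above 128 and values set to 0 stay below it.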
",python
+"Pandas: Most computationally efficient way to combine consecutive rows with conditionsSay I have dataframe like this
+df = pd.DataFrame({
+'position': ['head', 'tail', 'head', 'head', 'head', 'tail', 'tail', 'head'],
+ 'start': [2, 13, 54, 320, 654, 677, 3430, 9000],
+ 'end': [4, 15, 564, 390, 674, 679, 6000, 9010],
+ }) #s. e. k k. s. e. k
+df.head(10)
+
+
+ position start end
+0 head 2 4
+1 tail 13 15
+2 head 54 564
+3 head 320 390
+4 head 654 674
+5 tail 677 679
+6 tail 3430 6000
+7 head 9000 9010
+
+I want to combine rows such that if a row's position label is 'head' and the next row's position is 'tail', those rows are combined so that the 'start' value from the 'head' row and the 'end' value from the 'tail' row are used. If multiple consecutive 'tail' rows follow a 'head' row, the middle 'tail' rows are skipped and the 'end' of the last 'tail' row is used.
+It's tricky to explain, but here's an example dataframe of what the desired result should look like
+ position start end
+0 tail 2 15
+1 head 54 564
+2 head 320 390
+3 tail 654 6000
+4 head 9000 9010
+
+I came up with this solution using iterrows
+previous = None
+list_dicts = []
+for idx, row in df.iterrows():
+ if row['position'] == 'head':
+ if previous:
+ package = {'position': previous, 'start':previous_start, 'end':previous_end}
+ list_dicts.append(package)
+ previous = 'head'
+ previous_start = row['start']
+ previous_end = row['end']
+ elif row['position'] == 'tail':
+ previous = 'tail'
+ previous_start = previous_start
+ previous_end = row['end']
+if row['position'] == 'head':
+ package = {'position': row['position'], 'start':row['start'], 'end':row['end']}
+elif row['position'] == 'tail':
+ package = {'position': row['position'], 'start':previous_start, 'end':row['end']}
+list_dicts.append(package)
+
+pd.DataFrame(list_dicts).head(10)
+
+But I have read that iterrows should be avoided because it's not the most computationally efficient way to manipulate dataframes. And in this case, I am resorting to creating a brand new dataframe. But in the case of using conditionals based on consecutive rows, it's the only solution I can think of.
","One way using pandas.groupby:
+m = df["position"].eq("head").cumsum()
+new_df = df.groupby(m, as_index=False).agg({"position": "last",
+ "start": "first",
+ "end": "last"})
+print(new_df)
+
+Output:
+ position start end
+0 tail 2 15
+1 head 54 564
+2 head 320 390
+3 tail 654 6000
+4 head 9000 9010
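+The grouping key works because every 'head' row increments the running count, so each 'head' and the 'tail' rows that follow it share one number. A plain-Python sketch of the same logic, written only to make the key visible (it is not meant to replace the groupby):

```python
from itertools import accumulate

positions = ['head', 'tail', 'head', 'head', 'head', 'tail', 'tail', 'head']
starts = [2, 13, 54, 320, 654, 677, 3430, 9000]
ends = [4, 15, 564, 390, 674, 679, 6000, 9010]

# Running count of 'head' rows: the same values as df["position"].eq("head").cumsum()
keys = list(accumulate(int(p == 'head') for p in positions))

merged = {}
for key, pos, start, end in zip(keys, positions, starts, ends):
    if key not in merged:
        # first row of the group: keep its start
        merged[key] = {'position': pos, 'start': start, 'end': end}
    else:
        # later rows of the group: keep overwriting position and end ("last")
        merged[key]['position'] = pos
        merged[key]['end'] = end

rows = list(merged.values())
print(keys)
print(rows)
```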
+
",python
+"output from bert gives str not tensor in forward functionclass BertModel(nn.Module):
+ def __init__(self,pre_trained='bert-base-uncased'):
+ super().__init__()
+ self.bert = AutoModel.from_pretrained(pre_trained)
+ self.dropout = nn.Dropout(0.1)
+ self.relu = nn.ReLU()
+ self.fc1 = nn.Linear(768,512)
+ self.fc2 = nn.Linear(512,6)
+
+
+
+ def forward(self,inputs, mask, labels):
+
+ pooled, cls_hs = self.bert(input_ids=inputs,attention_mask=mask)
+ print(pooled)
+ print(cls_hs)
+ print(inputs)
+ print(mask)
+ x = self.fc1(cls_hs)
+ print(1)
+ x = self.relu(x)
+ print(2)
+ x = self.dropout(x)
+ print(3)
+ # output layer
+ x = self.fc2(x)
+ print(4)
+ # apply softmax activation
+ x = self.softmax(x)
+ print(5)
+
+
+last_hidden_state
+pooler_output
+
+
+tensor([[ 101, 2342, 2393, ..., 0, 0, 0],
+[ 101, 14477, 4779, ..., 4839, 6513, 102],
+[ 101, 14777, 2111, ..., 13677, 3613, 102],
+...,
+[ 101, 2113, 14047, ..., 0, 0, 0],
+[ 101, 5683, 3008, ..., 0, 0, 0],
+[ 101, 19046, 2075, ..., 2050, 3308, 102]])
+tensor([[1, 1, 1, ..., 0, 0, 0],
+[1, 1, 1, ..., 1, 1, 1],
+[1, 1, 1, ..., 1, 1, 1],
+...,
+[1, 1, 1, ..., 0, 0, 0],
+[1, 1, 1, ..., 0, 0, 0],
+[1, 1, 1, ..., 1, 1, 1]])
+
+in linear(input, weight, bias)
+if has_torch_function_variadic(input, weight, bias):
+return handle_torch_function(linear, (input, weight, bias), input, weight,
+bias=bias)
+return torch._C._nn.linear(input, weight, bias)
+TypeError: linear(): argument 'input' (position 1) must be Tensor, not str
+
+pooled and cls_hs are printed as the strings last_hidden_state and pooler_output, without any tensor.
+
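","In recent versions of transformers, the model returns a dict-like output object by default, so unpacking it into two names gives you its keys (the strings last_hidden_state and pooler_output) instead of tensors. Try replacing the line where you load the pretrained BERT model with:
+self.bert = AutoModel.from_pretrained(pre_trained, return_dict=False)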
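+The effect can be reproduced without any model. The sketch below uses an ordinary OrderedDict as a stand-in for the model output (an assumption for illustration only; the values are placeholder lists, not real tensors):

```python
from collections import OrderedDict

# Stand-in for a dict-like model output
output = OrderedDict(
    last_hidden_state=[[0.1, 0.2]],
    pooler_output=[0.3],
)

# Unpacking a mapping iterates over its keys, so both names become strings
first, second = output
print(type(first).__name__, first, second)

# Unpacking the values instead gives the actual data, in the same order
hidden, pooled = tuple(output.values())
print(hidden, pooled)
```

+With return_dict=False the model hands back a plain tuple, so the unpacking in forward receives the data rather than the key names.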
+
",python
+"How to group a list into a dataframe with four columns?Let's assume I have a list similar to the one below:
+l = ['A','B','C','D','E','F','G','H','I','L','M','N']
+
+
+I want to create a dataframe that has 4 columns from the fact that every 4 objects in the list is a row. The outcome should be a dataframe with the following form:
+Col1 Col2 Col3 Col4
+
+ A B C D
+
+ E F G H
+
+ I L M N
+
+Can anyone help me do it?
+Thanks!
","Convert values to numpy array and then use reshape:
+import numpy as np
+import pandas as pd
+
+l = ['A','B','C','D','E','F','G','H','I','L','M','N']
+df = pd.DataFrame(np.array(l).reshape(-1, 4)).add_prefix('col')
+print(df)
+ col0 col1 col2 col3
+0 A B C D
+1 E F G H
+2 I L M N
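+If NumPy is not at hand, the rows can also be cut with list slicing first. A short sketch; the helper name chunk is illustrative, and the resulting list of rows could be passed to pd.DataFrame together with a columns argument:

```python
def chunk(values, width):
    """Split values into consecutive rows of the given width."""
    return [values[i:i + width] for i in range(0, len(values), width)]


l = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'L', 'M', 'N']
rows = chunk(l, 4)
print(rows)
```

+Unlike reshape(-1, 4), slicing does not raise when the list length is not a multiple of 4; the last row is simply shorter.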
+
",python
+"Web-scraping with python3.9 with a Load more buttonI am very new to python. I am trying to extract the full list of countries + districts from this website :
+https://www.expo2020dubai.com/en/understanding-expo/participants/country-pavilions
+To get the full list, there is a javascript Load More button. The URL doesn't change when clicking on LOAD MORE Button.
+Mobility
+Albania Pavilion
+I want to extract the District that is on the "content__subtitle" class and the Country that is on the "content__title" class.
+Here is my full script. I can extract from the first page but I don't know how to get the other pages.
+Also, I am writing the result to a CSV file but for some reason nothing is written with my code.
+import requests
+from pprint import pprint
+from bs4 import BeautifulSoup
+import csv
+
+url = "https://www.expo2020dubai.com/en/understanding-expo/participants/country-pavilions"
+
+liste = []
+
+def extract_pavillons(url):
+ r = requests.get(url)
+ soup = BeautifulSoup(r.content, 'html.parser')
+
+ with open('pavillons.csv', 'w', encoding='UTF8') as f:
+ for item in soup.find_all(class_=["c-innovator-card -filter-list","c-innovator-card__container"]):
+ country = item.find(class_='content__title').text
+ country = country[:len(country) - 9]
+ district = item.find(class_='content__subtitle').text
+
+ liste.append([country, district])
+
+
+def write_pavillons(liste):
+ with open('pavillons.csv', 'w', newline='') as f:
+ writer = csv.writer(f)
+ writer.writerows(liste)
+
+extract_pavillons(url)
+
+write_pavillons(liste)
+
","Option I
+Use their API, which you can find among the XHR requests in the browser's developer tools:
+https://www.expo2020dubai.com/api/CardFilter/CardFilterLoadMore?ds=%7B1D599577-83BF-4AD3-ADDF-8409AC9CC359%7D&currentCount=8&LoadMoreCount=8&filters=%2C%2C&typeCategory=Country%2C&pageItemId=%7B91F63910-2B2D-45B7-BAA0-9338BBB101C7%7D&_=1636012101921
+With the parameters currentCount and LoadMoreCount you can fetch additional items.
+Option II
+Use selenium to click on the button and grab the next items
",python
+"How to transition from M2M created with 'through' to a M2M managed by DjangoI'm working in a Project where there were multiple fields created as M2M using the through related model, however they were wrongly created by then and the related models have no additional fields, so I would like to use a regular M2M managed by Django.
+The existing models I have are:
+class Student(models.Model):
+ # ... other fields..."
+ courses = models.ManyToManyField(
+ Course,
+ related_name='students',
+ through='StudentCourse'
+ )
+
+class StudentCourse(models.Model):
+ student = models.ForeignKey(Student, on_delete=models.CASCADE)
+ course = models.ForeignKey(Course, on_delete=models.CASCADE)
+
+class Course(models.Model):
+ # ... other fields..."
+
+
+And I would like to have:
+class Student(models.Model):
+ # ... other fields..."
+ course = models.ManyToManyField(Course, related_name='students')
+
+class Course(models.Model):
+ # ... other fields..."
+
+I'm not able to find a way in Django to do this without losing the data that was already inserted.
+I was thinking of renaming the tables in the same way Django does and manipulate the Django migration metadata but it is an ugly way to solve it.
+Is there a Django way to solve this without losing the data (or creating backup tables)?
","To do this you need to create a migration file that copies the data from the old through table to the new one.
+First create an empty migration file that will be populated with our migration operations
+python manage.py makemigrations <app> --empty
+
+Then that migration needs to be populated with the following (<app> should be replaced with the app name of the app)
+from django.db import migrations, models
+
+
+def move_courses(apps, schema_editor):
+ Student = apps.get_model('<app>', 'Student')
+ StudentCourse = apps.get_model('<app>', 'StudentCourse')
+ for student in Student.objects.all():
+ student.courses.set(
+ StudentCourse.objects.filter(student=student).values_list('course', flat=True)
+ )
+
+
+class Migration(migrations.Migration):
+
+ dependencies = [
+ ('<app>', '<XXXX_previous_migration_name>'),
+ ]
+
+ operations = [
+ # Remove the old ManyToManyField
+ # This won't delete the through table or its data
+ migrations.RemoveField(
+ model_name='Student',
+ name='courses',
+ ),
+ # Add the new ManyToManyField
+ migrations.AddField(
+ model_name='Student',
+ name='courses',
+ field=models.ManyToManyField(related_name='students', to='<app>.Course'),
+ ),
+ # Run a script that copies data from the old through table to the new one
+ migrations.RunPython(move_courses),
+ # Delete the old through table
+ migrations.DeleteModel(
+ name='StudentCourse',
+ ),
+ ]
+
+Then update your models in the way you like:
+class Student(models.Model):
+ # ... other fields..."
+ courses = models.ManyToManyField(
+ Course,
+ related_name='students',
+ )
+
+# Since through='StudentCourse' was removed, the model below is removed too (kept as a comment to represent the removal)
+# class StudentCourse(models.Model):
+# student = models.ForeignKey(Student, on_delete=models.CASCADE)
+# course = models.ForeignKey(Course, on_delete=models.CASCADE)
+
+class Course(models.Model):
+ # ... other fields..."
+
+
+After the models and the newly created empty migration are updated, you can run migrate
+python manage.py migrate
+
+Important: don't run makemigrations again, only migrate in order for Django to run the migration that moves the data.
",python
+"How to include $id fields in Pydantic.schema()According to json-schema.org, it is best practice to include the $id field with objects.
+I'm struggling with how to get this at the top level, for example;
+class MySchema(BaseModel):
+
+ id: str = Field(default="http://my_url/my_schema.json", alias="$id")
+
+
+if __name__ == '__main__':
+ pprint(MySchema.schema())
+
+yields
+{'properties': {'$id': {'default': 'http://my_url/my_schema.json',
+ 'title': '$Id',
+ 'type': 'string'}},
+ 'title': 'MySchema',
+ 'type': 'object'}
+
+How do I get $id at the top level, with title and type, not as a nested property?
","Pydantic provides a number of ways of schema customization. For example, using schema_extra config option:
+from pydantic import BaseModel
+
+
+class Person(BaseModel):
+ name: str
+ age: int
+
+ class Config:
+ schema_extra = {
+ '$id': "my.custom.schema"
+ }
+
+
+print(Person.schema_json(indent=2))
+
+Output:
+{
+ "title": "Person",
+ "type": "object",
+ "properties": {
+ "name": {
+ "title": "Name",
+ "type": "string"
+ },
+ "age": {
+ "title": "Age",
+ "type": "integer"
+ }
+ },
+ "required": [
+ "name",
+ "age"
+ ],
+ "$id": "my.custom.schema"
+}
+
",python
+"How to define the include & lib path when building pybind11 projI am building a pybind11 project with Visual Studio (2017). The setup file is like below:
+
+from setuptools import setup, Extension
+import pybind11
+
+# The following is for GCC compiler only.
+#cpp_args = ['-std=c++11', '-stdlib=libc++', '-mmacosx-version-min=10.7']
+cpp_args = []
+
+sfc_module = Extension(
+ 'test_sample',
+ sources=['Test.cpp'],
+ include_dirs=[pybind11.get_include()],
+ language='c++',
+ extra_compile_args=cpp_args,
+ )
+
+setup(
+ name='test_sample',
+ version='1.0',
+ description='Python package with Test C++ extension (PyBind11)',
+ ext_modules=[sfc_module],
+)
+
+
+Then in the windows power shell, I will run
+
+python setup.py build
+
+However it complains cannot find multiple include files, I believe it will complain about missing library files later too:
+C:\VS2017Pro\VC\Tools\MSVC\xxxx\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Anaconda3_CS\lib\site-packages\pybind11\include -IC:\Anaconda3_CS\include -IC:\Anaconda3_CS\include -IC:\VS2017Pro\VC\Tools\MSVC\xxxx\ATLMFC\include -IC:\VS2017Pro\VC\Tools\MSVC\xxxx\include /EHsc /TpCppPython.cpp /Fobuild\temp.win-amd64-3.7\Release\Test.obj
+Test.cpp
+
+Z:\test_pybind11\stdafx.h(8): fatal error C1083: Cannot open include file: 'targetver.h': No such file or directory
+
+I know where this targetver.h is, just don't know how to add its location to the include path.
+Your help will be greatly appreciated.
","I found where to add more include paths and lib paths: they go in the system environment variables INCLUDE and LIB.
+Go to Control Panel -> Edit Environment Variables, then add all the intended include-file paths to the variable INCLUDE and all the library paths to the variable LIB.
+After that, the rebuild should succeed.
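+Another option that may avoid changing system-wide variables is to list the extra directories on the Extension itself, through its include_dirs and library_dirs arguments. A sketch only: the two paths are hypothetical placeholders, and pybind11.get_include() from the question is left out here so the snippet runs without pybind11 installed:

```python
from setuptools import Extension

# Hypothetical locations; replace with the folders that hold targetver.h and the .lib files
EXTRA_INCLUDE = r"C:\path\to\project\headers"
EXTRA_LIB = r"C:\path\to\project\libs"

sfc_module = Extension(
    'test_sample',
    sources=['Test.cpp'],
    include_dirs=[EXTRA_INCLUDE],   # extra header search paths for the compiler
    library_dirs=[EXTRA_LIB],       # extra library search paths for the linker
    language='c++',
)

print(sfc_module.include_dirs, sfc_module.library_dirs)
```

+In the real setup.py, pybind11.get_include() would stay in include_dirs next to the extra path.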
",python
+How to install two python versions in one computerI have Python 3.10 (64-bit) on my computer. And I use VS Code. I need to install Python 3.8 (64-bit) because I need to work with curses and it only works with Python 3.8.
,"For that, you can use virtual environments for example using Conda.
+
+- Create a virtual environment for python 3.8. In the project folder run the following command
+
+ conda create --name "your-desired-environment-name" python="python-version"
+
+e.g. to create a virtual environment for Python 3.8, run
+ conda create --name env_python3.8 python=3.8
+
+
+- Activate the created environment
+
+ conda activate env_python3.8
+
+Then, in Visual Studio Code, you can easily switch from one virtual environment to another, depending on the project you are working on.
+The following How-to guide from VS Code can also be helpful.
",python
+"Shift multiple daily value columns forward by one year in PandasGiven a dataframe df as follows:
+import pandas as pd
+import numpy as np
+
+np.random.seed(2021)
+dates = pd.date_range('20130101', periods=720)
+df = pd.DataFrame(np.random.randint(0, 100, size=(720, 3)), index=dates, columns=list('ABC'))
+df
+
+Out:
+ A B C
+2013-01-01 85 57 0
+2013-01-02 94 86 44
+2013-01-03 62 91 29
+2013-01-04 21 93 24
+2013-01-05 12 70 70
+ .. .. ..
+2014-12-17 38 42 20
+2014-12-18 67 93 47
+2014-12-19 27 10 74
+2014-12-20 18 92 62
+2014-12-21 90 40 31
+
+How could I shift column B and C forward by one year? Thanks. Please note leap year issue.
+The code below seems to work for the example data, but for real data with NaNs inside B and C, it raises ValueError: cannot reindex from a duplicate axis:
+df[['B', 'C']] = df[['B', 'C']].shift(freq=pd.DateOffset(years=1))
+print(df)
+
+Out:
+ A B C
+2013-01-01 85 NaN NaN
+2013-01-02 94 NaN NaN
+2013-01-03 62 NaN NaN
+2013-01-04 21 NaN NaN
+2013-01-05 12 NaN NaN
+ .. ... ...
+2014-12-17 38 33.0 79.0
+2014-12-18 67 24.0 53.0
+2014-12-19 27 54.0 39.0
+2014-12-20 18 68.0 80.0
+2014-12-21 90 65.0 65.0
+
","The code below seems to work:
+df[['B', 'C']] = df[['B', 'C']].shift(freq=pd.DateOffset(years=1))
+print(df)
+
+Out:
+ A B C
+2013-01-01 85 NaN NaN
+2013-01-02 94 NaN NaN
+2013-01-03 62 NaN NaN
+2013-01-04 21 NaN NaN
+2013-01-05 12 NaN NaN
+ .. ... ...
+2014-12-17 38 33.0 79.0
+2014-12-18 67 24.0 53.0
+2014-12-19 27 54.0 39.0
+2014-12-20 18 68.0 80.0
+2014-12-21 90 65.0 65.0
+
",python
+"Python Datetime Timezone ShiftI have a time string disseminated from a MQTT broker that I would like to read and convert from its native timezone (U.S. Central Time) to Coordinated Universal Time (UTC). I am currently using Python 3.8.5 in Ubuntu 20.04 Focal Fossa, with the machine timezone set to UTC.
+The time string is as follows: 1636039288.815212
+To work with this time in Python, I am using a combination of the datetime and pytz libraries. My current core of code is as follows:
+from datetime import datetime, timedelta
+import pytz
+
+input = 1636039288.815212
+srctime = datetime.fromtimestamp(input, tz=pytz.timezone('US/Central'))
+
+After running this chunk, I receive the following undesired time output:
+datetime.datetime(2021, 11, 4, 10, 21, 28, 815212, tzinfo=<DstTzInfo 'US/Central' CDT-1 day, 19:00:00 DST>)
+
+It appears that despite explicitly defining 'US/Central' inside the initial timestamp conversion, 5 hours were subsequently subtracted from the initial time provided.
+What additional steps/alterations can I make to ensure that the initial time provided is unchanged, defined as US/Central, and that I can subsequently change to UTC?
","Python's fromtimestamp assumes that your input is UNIX time, which should refer to 1970-01-01 UTC, not some arbitrary time zone. If you encounter such a thing nevertheless (another epoch time zone), you'll need to set UTC and then replace the tzinfo:
+from datetime import datetime
+from dateutil import tz # pip install python-dateutil
+
+ts = 1636039288.815212
+
+dt = datetime.fromtimestamp(ts, tz=tz.UTC).replace(tzinfo=tz.gettz("US/Central"))
+
+print(dt)
+# 2021-11-04 15:21:28.815212-05:00
+
+# or in UTC:
+print(dt.astimezone(tz.UTC))
+# 2021-11-04 20:21:28.815212+00:00
+
+Note that I'm using dateutil here so that the replace operation is safe. Don't do that with pytz (you must use localize there). Once you upgrade to Python 3.9 or higher, use zoneinfo instead, so you only need the standard library.
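+On Python 3.9 or higher, the same steps could be sketched with zoneinfo. This assumes the system time zone database (or the tzdata package) is available so that the key "US/Central" resolves:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9

ts = 1636039288.815212

# Read the timestamp as if it were UTC, drop that label, then attach US Central
# to the unchanged wall-clock time.
wall_clock = datetime.fromtimestamp(ts, tz=timezone.utc).replace(tzinfo=None)
dt_central = wall_clock.replace(tzinfo=ZoneInfo("US/Central"))

# Convert the labeled time to real UTC
dt_utc = dt_central.astimezone(timezone.utc)

print(dt_central.isoformat())
print(dt_utc.isoformat())
```

+With zoneinfo, replace(tzinfo=...) picks the correct offset for the date in question, which is why it is safe here but not with pytz.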
",python
+"Get specific value from functioni was trying to do this:
+def get_sum_col(col):
+ val = ""
+ cont = ""
+ for x in range(1, mxrw1):
+ if type(ws_Sheet1.cell(row=x, column=col).value) == int or type(
+ ws_Sheet1.cell(row=x, column=col).value) == float:
+ if val == "":
+ cont = 1
+ val = ws_Sheet1.cell(row=x, column=col).value
+ else:
+ cont += 1
+ val = int(val) + int(ws_Sheet1.cell(row=x, column=col).value)
+ media = int(val) / int(cont)
+ return val, media
+
+And then to get the value i need something like
+print(get_sum_col(3).media
+print(get_sum_col(3).val
+
+is this possible?
+what am i doing wrong?
","Yes, you can do something like this:
+def get_sum_col(col):
+ ...
+ return {'val': val, 'media': media}
+
+print(get_sum_col(3).get('media'))  # prints None if the 'media' key is missing
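+If attribute access such as .media is preferred, a namedtuple is one way to get it. A sketch that takes a plain list of cell values instead of reading the worksheet, so the openpyxl part is left out; the name ColumnStats is made up:

```python
from collections import namedtuple

ColumnStats = namedtuple("ColumnStats", ["val", "media"])


def get_sum_col(values):
    """Return the sum and the mean of the numeric entries in values."""
    numbers = [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if not numbers:
        return ColumnStats(val=0, media=None)  # no numeric cells in the column
    total = sum(numbers)
    return ColumnStats(val=total, media=total / len(numbers))


stats = get_sum_col([10, "header", 20, None, 30])
print(stats.val)
print(stats.media)
```

+Calling the function once and reusing stats also avoids walking the column twice.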
+
",python
+"Run multiple async loops in separate processes within a main async appOk so this is a bit convoluted but I have a async class with a lot of async code.
+I wish to parallelize a task inside that class and I want to spawn multiple processes to run a blocking task and also within each of this processes I want to create an asyncio loop to handle various subtasks.
+So I sort of managed to do this with a ThreadPoolExecutor, but when I try to use a ProcessPoolExecutor I get a Can't pickle local object error.
+This is a simplified version of my code that runs with ThreadPoolExecutor. How can this be parallelized with ProcessPoolExecutor?
+import asyncio
+import time
+from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
+
+
+class MyClass:
+ def __init__(self) -> None:
+ self.event_loop = None
+ # self.pool_executor = ProcessPoolExecutor(max_workers=8)
+ self.pool_executor = ThreadPoolExecutor(max_workers=8)
+ self.words = ["one", "two", "three", "four", "five"]
+ self.multiplier = int(2)
+
+async def subtask(self, letter: str):
+ await asyncio.sleep(1)
+ return letter * self.multiplier
+
+async def task_gatherer(self, subtasks: list):
+ return await asyncio.gather(*subtasks)
+
+def blocking_task(self, word: str):
+ time.sleep(1)
+ subtasks = [self.subtask(letter) for letter in word]
+ result = asyncio.run(self.task_gatherer(subtasks))
+ return result
+
+async def master_method(self):
+ self.event_loop = asyncio.get_running_loop()
+ master_tasks = [
+ self.event_loop.run_in_executor(
+ self.pool_executor,
+ self.blocking_task,
+ word,
+ )
+ for word in self.words
+ ]
+
+ results = await asyncio.gather(*master_tasks)
+ print(results)
+
+
+if __name__ == "__main__":
+ my_class = MyClass()
+ asyncio.run(my_class.master_method())
+
","This is a very good question. Both the problem and the solution are quite interesting.
+The Problem
+One difference between multithreading and multiprocessing is how memory is handled. Threads share a memory space. Processes do not (in general, see below).
+Objects are passed to a ThreadPoolExecutor simply by reference. There is no need to create new objects.
+But a ProcessPoolExecutor lives in a separate memory space. To pass objects to it, the implementation pickles the objects and unpickles them again on the other side. This detail is often important.
+Look carefully at the arguments to blocking_task in the original question. I don't mean word - I mean the first argument: self. The one that's always there. We've seen it a million times and hardly even think about it. To execute the function blocking_task, a value is required for the argument named "self." To run this function in a ProcessPoolExecutor, "self" must get pickled and unpickled. Now look at some of the member objects of "self": there's an event loop and also the executor itself. Neither of which is pickleable. That's the problem.
+There is no way we can run that function, as is, in another Process.
+Admittedly, the traceback message "Cannot pickle local object" leaves a lot to be desired. So does the documentation. But it actually makes total sense that the program works with a ThreadPool but not with a ProcessPool.
+Note: There are mechanisms for sharing ctypes objects between Processes. However, as far as I'm aware, there is no way to share Python objects directly. That's why the pickle/unpickle mechanism is used.
+The Solution
+Refactor MyClass to separate the data from the multiprocessing framework. I created a second class, MyTask, which can be pickled and unpickled. I moved a few of the functions from MyClass into it. Nothing of importance has been modified from the original listing - just rearranged.
+The script runs successfully with both ProcessPoolExecutor and ThreadPoolExecutor.
+import asyncio
+import time
+# from concurrent.futures import ThreadPoolExecutor
+from concurrent.futures import ProcessPoolExecutor
+
+# Refactored MyClass to break out MyTask
+
+class MyTask:
+    def __init__(self):
+        self.multiplier = 2
+
+    async def subtask(self, letter: str):
+        await asyncio.sleep(1)
+        return letter * self.multiplier
+
+    async def task_gatherer(self, subtasks: list):
+        return await asyncio.gather(*subtasks)
+
+    def blocking_task(self, word: str):
+        time.sleep(1)
+        subtasks = [self.subtask(letter) for letter in word]
+        result = asyncio.run(self.task_gatherer(subtasks))
+        return result
+
+class MyClass:
+    def __init__(self):
+        self.task = MyTask()
+        self.event_loop: asyncio.AbstractEventLoop = None
+        self.pool_executor = ProcessPoolExecutor(max_workers=8)
+        # self.pool_executor = ThreadPoolExecutor(max_workers=8)
+        self.words = ["one", "two", "three", "four", "five"]
+
+    async def master_method(self):
+        self.event_loop = asyncio.get_running_loop()
+        master_tasks = [
+            self.event_loop.run_in_executor(
+                self.pool_executor,
+                self.task.blocking_task,
+                word,
+            )
+            for word in self.words
+        ]
+
+        results = await asyncio.gather(*master_tasks)
+        print(results)
+
+if __name__ == "__main__":
+    my_class = MyClass()
+    asyncio.run(my_class.master_method())
+
",python
+"Is there a simple way to use Orange3 with an Nvidia GPU?
+I need to cluster a high-dimensional dataset in the Orange3 app, and too much time is spent calculating the distance matrix between the objects. If I could use a graphics card for this task, it would take much less time to complete. Does anyone know a workaround to do this?
","No. Orange uses numpy arrays and computes distances on CPU. Short of reimplementing the routine for calculation of distances (which in itself is rather short and simple), there's nothing you can do about it.
+Orange will start using Dask in some not too distant future, but until then try reducing your data set. You may not need all dimensions and/or objects for your clustering.
",python
+"Extract EPSG code from GeoDataFrame.crs result
+Let's say I have a GeoDataFrame with a CRS set.
+gdf.crs
+
+gives me
+<Projected CRS: EPSG:25833>
+Name: ETRS89 / UTM zone 33N
+Axis Info [cartesian]:
+- [east]: Easting (metre)
+- [north]: Northing (metre)
+Area of Use:
+- undefined
+Coordinate Operation:
+- name: UTM zone 33N
+- method: Transverse Mercator
+Datum: European Terrestrial Reference System 1989
+- Ellipsoid: GRS 1980
+- Prime Meridian: Greenwich
+
+This is of type <class 'pyproj.crs.crs.CRS'>.
+Is there a way to extract the EPSG code from this, i.e. 25833?
",".crs returns a pyproj.CRS object. This should get you the EPSG code from the object:
+gdf.crs.to_epsg()
+
+pyproj docs
",python
+"How To Fetch Project quotas using OpenstackSDK?
+I am looking for a small piece of code to fetch the quota limits of all my projects. In the openstacksdk dev repository I can see there is a module called compute.v2.limits, but using that I'm getting blank output like below.
+import openstack
+import openstack.compute.v2.limits
+import os
+import datetime
+
+################## getting limits ##########################
+ip='10.6.X.X'
+auth_url_domain="https://10.6.X.X:13000/v3"
+conn = openstack.connect(auth_url=auth_url_domain, project_name="admin", username="admin", domain_name="default",password="XXXXXX", cacert="/root/app/cacrt/cacrt/ca.crt10.6.X.X.pem")
+data=openstack.compute.v2.limits.AbsoluteLimits(session=conn)
+print(data.to_dict())
+
+Output i m getting ( key is ok but values are none . I want values should be as per values set for all projects):
+{'server_groups': None, 'keypairs': None, 'total_ram_used': None,
+ 'total_cores': None, 'instances_used': None, 'total_cores_used': None,
+ 'instances': None, 'id': None, 'security_groups': None,
+ 'total_ram': None, 'floating_ips': None, 'location': None,
+ 'personality_size': None, 'server_groups_used': None,
+ 'personality': None, 'security_group_rules': None,
+ 'name': None, 'server_group_members': None, 'floating_ips_used': None,
+ 'image_meta': None, 'server_meta': None, 'security_groups_used': None}
+
","You can get a project's quotas directly from the connection object, like this:
+conn = openstack.connect(auth_url=auth_url_domain...)
+quotas = conn.get_compute_quotas(project_name_or_id)
+
+The connection offers other common methods as well; check them as you need them.
+
+I want values should be as per values set for all projects
+
+If you want the quotas of all projects, you need to create the corresponding connection for each one, because a connection is bound to the project passed as a parameter to openstack.connect. If you have the admin role, however, you can get the information for all projects.
",python
+"Parse a multi-layered JSON from INSEE API
+I'm sending a request to the INSEE API, the official French repository of registered companies.
+Here is my request:
+headers = {
+ 'Accept': 'application/json',
+ 'Authorization': 'xxx'
+}
+
+params = (
+ ('q', 'siren:530085802'),
+ ('date', '2021-10-01'),
+ ('champs', 'siret, denominationUniteLegale, codePostalEtablissement, libelleCommuneEtablissement, denominationUsuelleEtablissement'),
+ ('debut', 1)
+ # ('nombre', 1)
+)
+
+response = requests.get('https://api.insee.fr/entreprises/sirene/V3/siret', headers=headers, params=params)
+
+For this example, I'm requesting Facebook's legal entity by its number : siren:530085802 (this is Open Data, nothing here is confidential).
+Now I get a response:
+response.text
+
+This :
+'{"header":{"statut":200,"message":"OK","total":3,"debut":1,"nombre":2},"etablissements":[{"siret":"53008580200011","uniteLegale":{"denominationUniteLegale":"FACEBOOK FRANCE"},"adresseEtablissement":{"codePostalEtablissement":"75116","libelleCommuneEtablissement":"PARIS 16"},"periodesEtablissement":[{"dateFin":null,"dateDebut":"2012-06-14","denominationUsuelleEtablissement":null},{"dateFin":"2012-06-13","dateDebut":"2011-02-17","denominationUsuelleEtablissement":null},{"dateFin":"2011-02-16","dateDebut":"2011-01-01","denominationUsuelleEtablissement":null}]},{"siret":"53008580200037","uniteLegale":{"denominationUniteLegale":"FACEBOOK FRANCE"},"adresseEtablissement":{"codePostalEtablissement":"75002","libelleCommuneEtablissement":"PARIS 2"},"periodesEtablissement":[{"dateFin":null,"dateDebut":"2018-06-26","denominationUsuelleEtablissement":null},{"dateFin":"2018-06-25","dateDebut":"2016-04-18","denominationUsuelleEtablissement":null}]}]}'
+
+As per the official documentation, there are multiple layers :
+
+- Legal entity: identified by a siren
+- Establishment: identified by a siret, which is the SIREN + 5 characters. Basically, here I'm getting a list of establishments, since I'm requesting only one SIREN
+- Periods: archived change tracking of several attributes, with a dateDebut (starting date), a dateFin (end date) and attributes, among which is the denominationUsuelleEtablissement. I'd like to have the current values.
+
+Despite putting a date in my request, the API responded with several periods, not only the one that includes the date I'm providing. Hence adding a layer.
+What I'm trying to do is to flatten the response and convert it to a dataframe for easier use.
+Here is what I've done so far, using pd.json_normalize:
+import json
+import pandas as pd
+
+contenuReponse = json.loads(response.text)
+etablissements = contenuReponse['etablissements']
+
+df = pd.json_normalize(etablissements)
+df
+
+I get this:
+
+| | siret | periodesEtablissement | uniteLegale.denominationUniteLegale |
+|---|---|---|---|
+| 0 | 53008580200011 | [{'dateFin': None, 'dateDebut': '2012-06-14', ... | FACEBOOK FRANCE |
+| 1 | 53008580200037 | [{'dateFin': None, 'dateDebut': '2018-06-26', ... | FACEBOOK FRANCE |
+
+json_normalize ignores the layer below periodesEtablissement.
+Is there a way to successfully flatten the whole response, or to use only the periodesEtablissement entry relevant to my request, meaning the one with no dateFin?
","You could try using this function:
+def flatten_nested_json_df(df):
+    df = df.reset_index()
+    s = (df.applymap(type) == list).all()
+    list_columns = s[s].index.tolist()
+
+    s = (df.applymap(type) == dict).all()
+    dict_columns = s[s].index.tolist()
+
+
+    while len(list_columns) > 0 or len(dict_columns) > 0:
+        new_columns = []
+
+        for col in dict_columns:
+            horiz_exploded = pd.json_normalize(df[col]).add_prefix(f'{col}.')
+            horiz_exploded.index = df.index
+            df = pd.concat([df, horiz_exploded], axis=1).drop(columns=[col])
+            new_columns.extend(horiz_exploded.columns) # inplace
+
+        for col in list_columns:
+            print(f"exploding: {col}")
+            df = df.drop(columns=[col]).join(df[col].explode().to_frame())
+            new_columns.append(col)
+
+        s = (df[new_columns].applymap(type) == list).all()
+        list_columns = s[s].index.tolist()
+
+        s = (df[new_columns].applymap(type) == dict).all()
+        dict_columns = s[s].index.tolist()
+    return df
+
+and do this:
+flatten_nested_json_df(df)
+
+which gives:
+ index siret uniteLegale.denominationUniteLegale \
+0 0 53008580200011 FACEBOOK FRANCE
+0 0 53008580200011 FACEBOOK FRANCE
+0 0 53008580200011 FACEBOOK FRANCE
+1 1 53008580200037 FACEBOOK FRANCE
+1 1 53008580200037 FACEBOOK FRANCE
+
+ adresseEtablissement.codePostalEtablissement \
+0 75116
+0 75116
+0 75116
+1 75002
+1 75002
+
+ adresseEtablissement.libelleCommuneEtablissement \
+0 PARIS 16
+0 PARIS 16
+0 PARIS 16
+1 PARIS 2
+1 PARIS 2
+
+ periodesEtablissement.dateFin periodesEtablissement.dateDebut \
+0 None 2012-06-14
+0 2012-06-13 2011-02-17
+0 2011-02-16 2011-01-01
+1 None 2018-06-26
+1 2018-06-25 2016-04-18
+
+ periodesEtablissement.denominationUsuelleEtablissement
+0 None
+0 None
+0 None
+1 None
+1 None
+
+
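If only the current period is needed (the one whose dateFin is null), a plain-Python alternative is to pick that period for each establishment before building the dataframe. This is only a sketch: current_periods is a hypothetical helper, the payload is a trimmed copy of the response shown in the question, and it assumes each establishment has at most one open period.

```python
import json

# Trimmed copy of the response from the question (two establishments).
raw = """
{"etablissements": [
  {"siret": "53008580200011",
   "uniteLegale": {"denominationUniteLegale": "FACEBOOK FRANCE"},
   "periodesEtablissement": [
     {"dateFin": null, "dateDebut": "2012-06-14", "denominationUsuelleEtablissement": null},
     {"dateFin": "2012-06-13", "dateDebut": "2011-02-17", "denominationUsuelleEtablissement": null}]},
  {"siret": "53008580200037",
   "uniteLegale": {"denominationUniteLegale": "FACEBOOK FRANCE"},
   "periodesEtablissement": [
     {"dateFin": null, "dateDebut": "2018-06-26", "denominationUsuelleEtablissement": null},
     {"dateFin": "2018-06-25", "dateDebut": "2016-04-18", "denominationUsuelleEtablissement": null}]}
]}
"""


def current_periods(payload):
    """Build one flat dict per establishment from its open period (dateFin is None)."""
    rows = []
    for etab in payload["etablissements"]:
        current = next(
            (p for p in etab["periodesEtablissement"] if p["dateFin"] is None),
            None,
        )
        if current is None:
            continue  # no open period for this establishment
        row = {
            "siret": etab["siret"],
            "denominationUniteLegale": etab["uniteLegale"]["denominationUniteLegale"],
        }
        row.update(current)  # adds dateFin, dateDebut, denominationUsuelleEtablissement
        rows.append(row)
    return rows


rows = current_periods(json.loads(raw))
for row in rows:
    print(row["siret"], row["dateDebut"])
```

The resulting list of flat dicts can then be handed to pd.DataFrame(rows) to get one row per establishment.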
",python
+"Python Pandas: Best way to find local maximums in large DF
+I have a large dataframe that consists of many cycles; each cycle contains 2 peak values that I need to capture into another dataframe.
+I have created a sample data frame that mimics the data I am seeing:
+import pandas as pd
+
+data = {'Cycle':[1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3], 'Pressure':[100,110,140,180,185,160,120,110,189,183,103,115,140,180,200,162,125,110,196,183,100,110,140,180,185,160,120,180,201,190]}
+
+df = pd.DataFrame(data)
+
+As you can see, each cycle has two maxima. The part I was having trouble with is that the 2nd peak is usually higher than the first peak, so there can be rows near one peak that are technically higher than the other peak's maximum in the same cycle. The results should look something like this:
+data2 = {'Cycle':[1,1,2,2,3,3], 'Peak Maxs': [185,189,200,196,185,201]}
+
+df2= pd.DataFrame(data2)
+
+I have tried a couple of methods, including .nlargest(2) per cycle, but the problem is that since one of the peaks is usually higher, it will pull the 2nd highest number in the data, which isn't necessarily the other peak.
+This graph shows the peak pressures from each cycle that I would like to be able to find.
+![]()
+Thanks for any help.
","Use argrelextrema from scipy.signal:
+import numpy as np
+from scipy.signal import argrelextrema
+out = df.groupby('Cycle')['Pressure'].apply(lambda x : x.iloc[argrelextrema(x.values, np.greater)])
+Out[124]:
+Cycle
+1 4 185
+ 8 189
+2 14 200
+ 18 196
+3 24 185
+ 28 201
+Name: Pressure, dtype: int64
+
+out = out.sort_values().groupby(level=0).tail(2).sort_index()
+out
+Out[138]:
+Cycle
+1 4 185
+ 8 189
+2 14 200
+ 18 196
+3 24 185
+ 28 201
+Name: Pressure, dtype: int64
+
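For reference, what argrelextrema(..., np.greater) computes within each cycle can be sketched in plain Python: an interior point counts as a peak when it is strictly greater than both of its neighbours. The helper name local_maxima is made up for this illustration, and the data is the sample from the question.

```python
from collections import defaultdict

cycles = [1] * 10 + [2] * 10 + [3] * 10
pressures = [100, 110, 140, 180, 185, 160, 120, 110, 189, 183,
             103, 115, 140, 180, 200, 162, 125, 110, 196, 183,
             100, 110, 140, 180, 185, 160, 120, 180, 201, 190]


def local_maxima(values):
    """Return interior values that are strictly greater than both neighbours."""
    return [values[i] for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


# Group the pressures by cycle, keeping their original order.
by_cycle = defaultdict(list)
for cycle, pressure in zip(cycles, pressures):
    by_cycle[cycle].append(pressure)

peaks = {cycle: local_maxima(values) for cycle, values in by_cycle.items()}
print(peaks)
```

On noisier data this simple rule finds more than two peaks per cycle, which is why the answer keeps only the two largest local maxima in its second step.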
",python
+"Logistic regression (from scratch) predicts 0.5 for everything
+I need help writing my own logistic regression model.
+Input data (generated via sklearn.datasets.make_classification) looks like this (printed for 10 entries):
+[[ 0.74186571 -1.69239663 2.06965145]
+ [-1.80076727 0.59700581 -1.57159523]
+ [ 1.0328198 0.62274582 0.90241322]
+ [-0.63972474 2.12054103 1.30124807]
+ [ 1.04275475 -0.86879077 1.08399317]
+ [-1.12772782 0.26396098 -1.68130012]
+ [ 0.92281318 -1.15431326 0.23868389]
+ [-0.37260971 -0.97979894 1.65890322]
+ [ 0.4513904 0.30502349 2.46449598]
+ [-2.79502998 0.05500871 -2.47725562]]
+
+Output like this:
+[0. 0. 1. 1. 0. 1. 0. 0. 0. 0.]
+
+Here is my code:
+import numpy
+import math
+
+# import data
+path = r"C:\Users\felix\sciebo2\Atom_working_dir\ML_sandbox\\" # put in global path here
+x_logregdata = numpy.genfromtxt(path+"x_logregdata.csv", delimiter=",")
+y_logregdata = numpy.genfromtxt(path+"y_logregdata.csv", delimiter=",")
+
+class MyLogReg:
+
+ def __init__(self, n):
+ self._numinputs = n
+ self._weights = numpy.random.rand(n)
+ self._bias = 0 # or = numpy.random.rand(1)
+ self._lam = 0.01 # lambda parameter for regularization
+
+ def pred(self, x):
+
+ return sigmoid(numpy.matmul(x, self._weights) + self._bias)
+
+# -----------------------------------------------------------------------------
+# functions
+
+def logloss(net,x,y): # p=prediction, t=target
+ p = sigmoid(numpy.matmul(x, net._weights) + net._bias) # (1000,3)x(3,1) becomes (1000,1)
+ logloss_term = (numpy.matmul(-y.T, numpy.log(p)) - numpy.matmul((1-y.T),numpy.log(1-p)))/len(p) # transpose, so (1,1000)x(1000,1) becomes (1)
+ regularization_term = numpy.mean(net._weights**2 * net._lam/2)
+ J = logloss_term + regularization_term
+ return J
+
+def sigmoid(z):
+ # z = b + w1x1 + w2x2 + ... + wnxn
+ return 1/(1+numpy.exp(-z))
+
+def train(net,x,t,epochs,lr):
+ for epoch in range(epochs):
+ # x comes in the shape of {n_samples, n_features}
+ p = net.pred(x) # make predictions
+ e = logloss(net,x,t)
+ grad_w = (numpy.matmul(x.T,(p-t)) + net._lam*net._weights)/len(p) # dim: (3,1)
+ grad_b = delta_bi = numpy.mean(p-t)
+ if (epoch+1)%100 == 0:
+ print(f"Epoch {epoch+1} | Error: {e} with weights {NN._weights} and bias {NN._bias}")
+ print(f"grad_w: {grad_w} --- grad_b: {grad_b}")
+ # update weights and bias
+ NN._weights += lr*(-grad_w)
+ NN._bias += lr*(-grad_b)
+
+NN = MyLogReg(3)
+
+# # -----------------------------------------------------------------------------
+# # training process
+print(80*"--")
+print("Training process started!\n")
+print(numpy.shape(x_combined),numpy.shape(x_logregdata))
+train(NN, x_logregdata, y_logregdata, 100, 0.5)
+
+Training output at last epoch:
+Epoch 1000 | Error: 0.6919551095870804 with weights [ 0.01013472 0.04960763 -0.06680454] and bias -0.012293556999268884
+grad_w: [-4.62077479e-18 6.62621000e-18 7.92638524e-18] --- grad_b: -1.3322676295501879e-18
+
+After training on the entire data set, predictions look random and are all close to 0.5, even though gradients have become very small:
+for i in range(10):
+ print((NN.pred(x_logregdata[i,:]), y_logregdata[i]))
+
+
+OUT:
+(0.4434940888958877, 0.0)
+(0.5259920670963649, 0.0)
+(0.49219604260551786, 1.0)
+(0.4998723781629637, 1.0)
+(0.4707235000047709, 0.0)
+(0.525400671366715, 1.0)
+(0.48097184157406053, 0.0)
+(0.4562378027506945, 0.0)
+(0.46077410720140005, 0.0)
+(0.531856873614567, 0.0)
+
+Github repo is here: Click.
","It's unclear why you want to initialize your weights randomly; if you give all features equal weights, it converges pretty well within 100 epochs:
+class MyLogReg:
+
+    def __init__(self, n):
+        self._numinputs = n
+        self._weights = numpy.repeat(1.0,n)
+        self._bias = 0 # or = numpy.random.rand(1)
+        self._lam = 0.01 # lambda parameter for regularization
+
+    def pred(self, x):
+
+        return sigmoid(numpy.matmul(x, self._weights) + self._bias)
+
+import numpy as np
+from sklearn.datasets import make_classification
+
+X,y = make_classification(n_features=3,n_redundant=1,n_informative=2,
+class_sep=0.7,random_state=22)
+
+NN = MyLogReg(3)
+train(NN, X, y, 100, 0.5)
+
+Epoch 100 | Error: 0.3930719987698364 with weights [0.08555415 1.38852617 1.65479616] and bias 0.27656565660444704
+grad_w: [ 7.85974808e-05 -8.63304471e-04 -1.02206054e-03] --- grad_b: -0.00022828688162301436
+
+NN.pred(X)[:20]
+
+array([0.04147539, 0.75444612, 0.92599311, 0.92906995, 0.330026 ,
+ 0.96483765, 0.90184527, 0.21597661, 0.43732915, 0.17307697,
+ 0.98569769, 0.11334725, 0.90186428, 0.96431985, 0.27836055,
+ 0.05338276, 0.02682678, 0.96073064, 0.32182455, 0.57531559])
+
+We can check the training accuracy:
+from sklearn.metrics import confusion_matrix
+confusion_matrix(y,(NN.pred(X)>0.5).astype(int))
+
+array([[43, 7],
+ [ 6, 44]])
+
+If you really want to initialize the weights from a random uniform distribution, I suspect you need to increase the learning rate for it to converge.
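The update rule used in train can also be checked on a toy problem without NumPy. The sketch below is an illustration under assumptions, not the code from the question: the one-feature dataset is made up, the weights start at 1.0 as in the answer, and the gradients follow the same formulas, (X^T(p - t) + lam * w) / n for the weights and mean(p - t) for the bias.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability for a single sample x (list of features)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(xs, ts, epochs=100, lr=0.5, lam=0.01):
    """Batch gradient descent for L2-regularized logistic regression."""
    n = len(xs)
    n_features = len(xs[0])
    w = [1.0] * n_features  # equal starting weights, as in the answer
    b = 0.0
    for _ in range(epochs):
        ps = [predict(w, b, x) for x in xs]
        grad_w = [
            (sum((p - t) * x[j] for p, t, x in zip(ps, ts, xs)) + lam * w[j]) / n
            for j in range(n_features)
        ]
        grad_b = sum(p - t for p, t in zip(ps, ts)) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Made-up, linearly separable toy data: negative x -> class 0, positive x -> class 1.
xs = [[-2.0], [-1.0], [1.0], [2.0]]
ts = [0, 0, 1, 1]
w, b = train(xs, ts)
print(w, b)
print([round(predict(w, b, x), 3) for x in xs])
```

If the predictions on data like this stay near 0.5, the problem is more likely in the data loading or the update step than in the gradient formulas themselves.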
",python
+"Fill merged columns with 0 instead of NaN
+I have a problem. I want to merge two dataframes, but the NaN values should be filled with 0 instead, and only in the "new" columns. How could I do that?
+What I tried
+df3 = pd.merge(df2, grouped_df_one,on=['id', 'host_id'], how='left', fill = 0)
+[OUT]
+TypeError: merge() got an unexpected keyword argument 'fill'
+
+d = {'host_id': [1, 1, 2],
+ 'id': [10, 11, 20],
+ 'value': ["Hot Water,Cold Water,Kitchen,Coffee",
+ "Hot Water,Coffee,Something",
+ "Hot Water,Coffee"]}
+df = pd.DataFrame(data=d)
+print(df)
+
+
+d2 = {'host_id': [1, 1, 2, 3],
+ 'id': [10, 11, 20, 30],
+ 'some': ['test1', "test2", "test3", np.nan]}
+df2 = pd.DataFrame(data=d2)
+print(df2)
+
+df_path = df.copy()
+df_path.index = pd.MultiIndex.from_arrays(df_path[['host_id', 'id']].values.T, names=['host_id', 'id'])
+df_path = df_path['value'].str.split(',', expand=True)
+df_path = df_path.melt(ignore_index=False).dropna()
+df_path.reset_index(inplace=True)
+
+one_hot = pd.get_dummies(df_path['value'])
+df_one = df_path.drop('value',axis = 1)
+df_one = df_path.join(one_hot)
+
+grouped_df_one = df_one.groupby(['id']).max()
+grouped_df_one = grouped_df_one.drop(columns=['value', 'variable']).reset_index()
+
+df3 = pd.merge(df2, grouped_df_one,on=['id', 'host_id'], how='left')
+df3
+
+ host_id id value
+0 1 10 Hot Water,Cold Water,Kitchen,Coffee
+1 1 11 Hot Water,Coffee,Something
+2 2 20 Hot Water,Coffee
+
+ host_id id some
+0 1 10 test1
+1 1 11 test2
+2 2 20 test3
+3 3 30 NaN
+
+
+What I got
+ host_id id some Coffee Cold Water Hot Water Kitchen Something
+0 1 10 test1 1.0 1.0 1.0 1.0 0.0
+1 1 11 test2 1.0 0.0 1.0 0.0 1.0
+2 2 20 test3 1.0 0.0 1.0 0.0 0.0
+3 3 30 NaN NaN NaN NaN NaN NaN
+
+What I want
+ host_id id some Coffee Cold Water Hot Water Kitchen Something
+0 1 10 test1 1.0 1.0 1.0 1.0 0.0
+1 1 11 test2 1.0 0.0 1.0 0.0 1.0
+2 2 20 test3 1.0 0.0 1.0 0.0 0.0
+3 3 30 NaN 0 0 0 0 0
+
","You can fill specific columns of the merged dataframe (df3 in your example) using
+df3[list_cols] = df3[list_cols].fillna(0)
+
+where list_cols is e.g.
+list_cols = ["Coffee", "Cold Water", "Hot Water", "Kitchen", "Something"]
+
+See: Pandas fill multiple columns with 0 when null
",python
+"Keras multivariate time series forecasting model returns NaN as MAE and loss
+I have multivariate time series data, collected every 5 seconds for a few days.
+This includes columns of standardized data, which looks like below (few example values). "P1" is the label-column.
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+| | P1 | P2 | P3 | AI_T_MOWA | AI_T_OEL | AI_T_KAT_EIN | AI_T_KAT_AUS | P-Oel | P-Motorwasser |
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+| 0 | 0.8631193380009695 | 0.8964414887167506 | 0.8840858759128901 | -0.523186057460264 | -0.6599697679790338 | 0.8195843978382326 | 0.6536355179773343 | 2.0167991331023862 | 1.966765280217274 |
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+| 1 | 2.375731412346451 | 2.416190921505275 | 2.3921080971495456 | 1.2838015319452019 | 0.6783070711474897 | 2.204838829646018 | 2.250184559609546 | 2.752702514412287 | 2.7863834647854797 |
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+| 2 | 2.375731412346451 | 2.416190921505275 | 2.3921080971495456 | 1.2838015319452019 | 1.2914092683827934 | 2.2484584825559955 | 2.2968465552769324 | 2.4571347629025726 | 2.743245665597679 |
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+| 3 | 2.3933199248388406 | 2.416190921505275 | 2.3753522946913606 | 1.2838015319452019 | 1.5485166414169536 | 2.2557284247076588 | 2.3039344533529906 | 2.31839887954087 | 2.7863834647854797 |
+|-------|-----------------------|-----------------------|----------------------|-----------------------|-----------------------|------------------------|------------------------|----------------------|----------------------|
+
+Corresponding graphs of the standardized data show nothing out of the ordinary.
+![]()
+I have split this data into train, validation and test sets, so that the training set is the first 70% of the overall data, the validation set is the next 20% and the test set is the last 10%.
+train_df_st = df[0:int(self._n*0.7)]
+val_df_st = df[int(self._n*0.7):int(self._n*0.9)]
+test_df_st = df[int(self._n*0.9):]
+
+I then generate windows with the WindowGenerator class from the TensorFlow tutorial, like here.
+Using a simple Baseline model that predicts the output to be the same as the input, I get actual predictions, so I assume my generated windows are fine.
+The shapes of my batches are
+Input shape: (32, 24, 193)
+Output shape: (32, 24, 1)
+
+Now to the tricky part:
+I obviously want to use another model for better predictions. I have tried out using Conv1D using only one column and that worked, so I wanted to try it with this as well.
+My windows look like:
+CONV_WIDTH = 3
+LABEL_WIDTH = 24
+INPUT_WIDTH = LABEL_WIDTH + (CONV_WIDTH - 1)
+conv_window = WindowGenerator(
+ input_width=INPUT_WIDTH,
+ label_width=LABEL_WIDTH,
+ shift=1,
+ train_df=train_df_st, val_df=val_df_st, test_df=test_df_st, label_columns=["P1"])
+
+Total window size: 25
+Input indices: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
+Label indices: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
+Label column name(s): ['P1']
+
+I then define my model and use the compile_and_fit() method as used here.
+conv_model = tf.keras.Sequential([
+ tf.keras.layers.Conv1D(filters=32,
+ kernel_size=(CONV_WIDTH,),
+ activation='relu'),
+ tf.keras.layers.Dense(units=32, activation='relu'),
+ tf.keras.layers.Dense(units=1),
+])
+
+MAX_EPOCHS = 20
+
+def compile_and_fit(model, window, patience=2):
+ early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
+ patience=patience,
+ mode='min')
+
+ model.compile(loss=tf.losses.MeanSquaredError(),
+ optimizer=tf.optimizers.Adam(),
+ metrics=[tf.metrics.MeanAbsoluteError()])
+
+ history = model.fit(window.train, epochs=MAX_EPOCHS,
+ validation_data=window.val,
+ callbacks=[early_stopping])
+ return history
+
+history = compile_and_fit(window=conv_window, model=conv_model)
+
+Input and Output shapes are:
+Input shape: (32, 26, 193)
+Output shape: (32, 24, 1)
+
+My final output however is only two epochs that show nan as mean absolute error as well as loss:
+Epoch 1/20
+382/382 [==============================] - 2s 4ms/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
+Epoch 2/20
+382/382 [==============================] - 1s 3ms/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
+
+And if I plot some example windows I see that I get labels, but no predictions:
+![]()
+I have tried implementing yet another model (LSTM) with slightly different windows but a similar approach, and I get the same NaNs, so I believe the problem is not my model but something in my data.
","It turns out my standardization of the data was faulty; after normalizing it instead, I get actual values rather than NaN.
",python
+"How to locate a checkbox to verify its state using Selenium and Python
+I am trying to figure out if a checkbox on our website is in a checked state or not. The HTML for the checkbox is as follows:
+<input id="ID_StaffIsRostered" name="ID_Rostering" type="checkbox" checked="" data-toggle="toggle" data-onstyle="success" data-size="sm" data-on="Yes" data-off="No" onchange="toggleRosteredStaff()">
+
+I've tried finding this via these locators:
+
+(By.ID, "ID_StaffIsRostered")
+(By.XPATH, "input[@name='ID_Rostering']")
+(By.XPATH,"//*[@id='ID_StaffIsRostered']")
+(By.XPATH,"/html/body/div[1]/div/div[2]/div/div/div/div[2]/table[3]/tbody/tr/td/div[1]/div[2]/div/form/div/div[1]/div/label/div/input")
+(By.CSS_SELECTOR,"label:nth-child(1) > .btn-light .toggle-off")
+(By.XPATH,"//form[@id='ID_Form_Rostering']/div/div/div/label/div/div/label[2]")
+(By.XPATH,"//div/div/div/label/div/div/label[2]")
+
+Nothing has worked so far. I can, however, select the div containing the checkbox, as well as the labels that the checkbox switches between depending on what it is toggled to. The HTML for the div is:
+<div class="toggle btn btn-success btn-sm" data-toggle="toggle" style="width: 7.5px; height: 26px; min-width: 38px;">
+
+and for the labels:
+<div class="toggle-group"><label class="btn btn-success btn-sm toggle-on">Yes</label><label class="btn btn-light btn-sm toggle-off">No</label><span class="toggle-handle btn btn-light btn-sm"></span></div>
+
+What are some alternative ways I can achieve this? Currently I'm using Selenium 4.0.0 with Python 3.9
","As per the HTML Specification for Boolean attributes:
+
+A number of attributes are Boolean attributes. The presence of a
+boolean attribute on an element represents the true value, and the
+absence of the attribute represents the false value.
+If the attribute is present, its value must either be the empty string
+or a value that is an ASCII case-insensitive match for the attribute's
+canonical name, with no leading or trailing whitespace.
+
+The checked attribute can be represented in either of the ways:
+<input name=name id=id type=checkbox checked>
+<input name=name id=id type=checkbox checked="">
+<input name=name id=id type=checkbox checked="checked">
+
+
+Solution
+To verify the state of the checkbox you can probe the checked attribute and you can use the following code block:
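+from selenium.common.exceptions import TimeoutException
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support import expected_conditions as EC
+from selenium.webdriver.support.ui import WebDriverWait
+try:
+    WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "input#ID_StaffIsRostered[name='ID_Rostering'][checked]")))
+    print("Checkbox is in checked state")
+except TimeoutException:
+    print("Checkbox is in unchecked state")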
+
",python
+extract channel names from a multi-channel image
+I am using skimage.io.imread (which uses tifffile) to read a QPTIFF file. Multiple channels are successfully read as multiple dimensions. Is it possible to extract the channel names and other metadata?
,"PerkinElmer QPI metadata are stored as XML in the ImageDescription TIFF tags. To read the XML metadata, use the tifffile.TiffFile class, e.g.:
+from xml.etree import ElementTree
+from tifffile import TiffFile
+
+with TiffFile('LuCa-7color_Scan1.tiff') as tif:
+    for page in tif.series[0].pages:
+        print(ElementTree.fromstring(page.description).find('Name').text)
+
",python
+"Pandas assign value based on next row(s)
+Consider this simple pandas DataFrame with columns 'record', 'start', and 'param'. There can be multiple rows with the same record value, and each unique record value corresponds to the same start value. However, the 'param' value can be different for the same 'record' and 'start' combination:
+pd.DataFrame({'record':[1,2,3,4,4,5,6,7,7,7,8], 'start':[0,5,7,13,13,19,27,38,38,38,54], 'param':['t','t','t','u','v','t','t','t','u','v','t']})
+
+I'd like to make a column 'end' that takes the value of 'start' in the row with the next unique value of 'record'. The values of column 'end' should be:
+[5,7,13,19,19,27,38,54,54,54,NaN]
+
+I'm able to do this using a for loop, but I know this is not preferred when using pandas:
+max_end = 100
+for idx, row in df.iterrows():
+    try:
+        n = 1
+        next_row = df.iloc[idx+n]
+        while next_row['start'] == row['start']:
+            n = n+1
+            next_row = df.iloc[idx+n]
+        end = next_row['start']
+    except:
+        end = max_end
+    df.at[idx, 'end'] = end
+
+Is there an easy way to achieve this without a for loop?
","I have no doubt there is a smarter solution but here is mine.
+df1['end'] = df1.drop_duplicates(subset = ['record', 'start'])['start'].shift(-1).reindex(index = df1.index, method = 'ffill')
+
+-=EDIT=-
+Added subset into drop_duplicates to account for question amendment
",python
+"Creating Start & End Date features out from one date column
+I'm trying to derive start and end date features from a single date column.
+Here's what it looks like:
+unique_id = ["001", "001", "001",
+ "002",
+ "003", "003"
+ ]
+end_dates = ["2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM", "2018-10-31 1:01:42 PM",
+ "2018-11-23 03:33:13 PM",
+ "2018-11-23 04:10:45 PM", "2018-11-23 04:13:58 PM"
+
+ ]
+activity_class = ["step 1", "step 2", "step 3",
+ "step 1",
+ "step 1", "step 2"
+ ]
+
+df = \
+pd.DataFrame({"ID": unique_id,
+ "Edit Date": end_dates,
+ "Activity": activity_class
+ })
+
+df["Edit Date"] = pd.to_datetime(df["Edit Date"])
+
+Here's how I want it to look like:
+unique_id = ["001", "001", "001",
+ "002",
+ "003", "003"
+ ]
+
+
+start_date = ["2018-10-31 12:43:03 PM", "2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM",
+ "2018-11-23 03:33:13 PM",
+ "2018-11-23 04:10:45 PM", "2018-11-23 04:10:45 PM"
+ ]
+
+end_date = ["2018-10-31 12:43:03 PM", "2018-10-31 12:44:23 PM", "2018-10-31 1:01:42 PM",
+ "2018-11-23 03:33:13 PM",
+ "2018-11-23 04:10:45 PM", "2018-11-23 04:13:58 PM"
+ ]
+
+activity_class = ["step 1", "step 2", "step 3",
+ "step 1",
+ "step 1", "step 2"
+ ]
+
+df = \
+pd.DataFrame({"ID": unique_id,
+ "Start_Date": start_date,
+ "End_Date": end_date,
+ "Activity": activity_class
+ })
+
+df["Start_Date"] = pd.to_datetime(df["Start_Date"])
+df["End_Date"] = pd.to_datetime(df["End_Date"])
+
+What I tried so far:
+df["Start_Date"] = df["Edit Date"].shift(1).backfill()
+
+Some of the rules:
+
+- Data is sorted ascending by unique id and date
+- Regardless of the label in "Activity", for the first row of each ID the start and end dates should be the same
+- Each following activity's start date should copy the previous activity's end date
+
","You are looking for groupby().shift()?
+df['Start_Date'] = df.groupby('ID')['Edit Date'].shift().fillna(df['Edit Date'])
+df['End_Date'] = df['Edit Date']
+
+Output:
+ ID Edit Date Activity Start_Date End_Date
+0 001 2018-10-31 12:43:03 step 1 2018-10-31 12:43:03 2018-10-31 12:43:03
+1 001 2018-10-31 12:44:23 step 2 2018-10-31 12:43:03 2018-10-31 12:44:23
+2 001 2018-10-31 13:01:42 step 3 2018-10-31 12:44:23 2018-10-31 13:01:42
+3 002 2018-11-23 15:33:13 step 1 2018-11-23 15:33:13 2018-11-23 15:33:13
+4 003 2018-11-23 16:10:45 step 1 2018-11-23 16:10:45 2018-11-23 16:10:45
+5 003 2018-11-23 16:13:58 step 2 2018-11-23 16:10:45 2018-11-23 16:13:58
+
",python
+"How to extract from text digits and calculate them in a specific order?
+I'm scraping data from a website which I want to analyze. In the section about job experience, I extract text which specifies how long someone has worked at the company. This information looks like this:
+
+Employment period \n2years 2 mon.
+
+It would be easier to analyze the period of employment expressed in months. Now I wonder how to extract this information from the text and calculate this properly.
+Calculation of the given example should be:
+
+2 x 12 + 2
+
+I try to do this in that way:
+def text_format(text: str):
+ digits = []
+ text = text.replace('\\n', ' ')
+ text = text.replace('.', '')
+ text = text.split()
+ for word in text:
+ if word.isalpha():
+ pass
+ else:
+ word = int(word)
+ digits.append(word)
+
+ total = digits[0] * 12 + digits[1]
+
+ return total
+
+In this particular case, the function above works well, but I may have other situations, e.g.
+
+Employment period \n3years
+
+OR
+
+Employment period 11 mon.
+
+I have no idea how to handle all possible scenarios.
","You can use a regex to tackle this problem and cover the scenarios you listed. For example, a pattern like the one below should make this task much easier:
+Employment period(?:[ \\n]+(\d+)[ ]*years?)?(?:[ \\n]+(\d+)[ ]*mon\.)?
+
+You can try it out here on the Regex Demo as well.
+Here's a Python example that runs through the specific use cases mentioned, along with some additional edge cases that I added:
+import re
+
+pattern = re.compile(r'Employment period(?:[ \\n]+(\d+)[ ]*years?)?(?:[ \\n]+(\d+)[ ]*mon\.)?')
+
+string = r"""\
+Employment period \n2years 2 mon.
+Employment period \n3years
+Employment period 11 mon.
+Employment period 010 years
+Employment period 1 year
+Employment period
+testing\
+"""
+
+for x in pattern.finditer(string):
+    print('Found a match:', x.group(0))
+    years, months = x.groups()
+    if years or months:
+        total_months = int(years or 0) * 12 + int(months or 0)
+        print(f'Total months: {total_months}')
+
+Output:
+Found a match: Employment period \n2years 2 mon.
+Total months: 26
+Found a match: Employment period \n3years
+Total months: 36
+Found a match: Employment period 11 mon.
+Total months: 11
+Found a match: Employment period 010 years
+Total months: 120
+Found a match: Employment period 1 year
+Total months: 12
+Found a match: Employment period
+
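As a rough sketch of the same idea in function form, the helper below returns the total number of months, or None when no duration is present. The name employment_months is hypothetical, and the sketch assumes the scraped text contains a real newline character (matched by \s) rather than the literal backslash-n used in the demo string above.

```python
import re


def employment_months(text):
    """Return the employment period in months, or None if none is found."""
    # Look for each unit separately so either one may be missing.
    years = re.search(r'(\d+)\s*years?', text)
    months = re.search(r'(\d+)\s*mon', text)
    if not years and not months:
        return None
    total = int(years.group(1)) * 12 if years else 0
    total += int(months.group(1)) if months else 0
    return total


print(employment_months('Employment period \n2years 2 mon.'))  # 26
```

Searching for the two units independently keeps the pattern short, at the cost of not checking that years come before months.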
",python
+"Print just the value of one json key response with pythonI'm new on it all.
+I have a url that returns me a token, in a json format. This token changes all the time I call this url. It's something like this:
+{
+"token": {
+ "token": "randomtoken",
+ "result": 1,
+ "resultCode": "2",
+ "requestId": "3"
+}}
+
+I want to print just the result of token key when I call my python code, which is like this:
+import requests as req
+
+resp = req.get("http://myurl.com.br")
+
+print(resp.text)
+
+This python code is returning me the following result:
+{"token":{"token":"randomtoken","result":1,"resultCode":"2","requestId":"3"}}
+
+How can I print just the token key result? Like just:
+"randomtoken"
+
+Is it possible?
","The response body is JSON that parses to a dict, so you can access the value with its string keys:
+import requests as req
+
+resp = req.get("http://myurl.com.br")
+token_key = resp.json()['token']['token']
+print(token_key)
+
+
+Note that:
+
+- resp.text returns a string; you would need json.loads(resp.text) to get a dict
+- resp.json() does that loading from the text for you
+
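To try the key access without calling the URL, here is a minimal sketch that parses the sample response from the question with json.loads. The variable names raw, data and token are only illustrative.

```python
import json

# The body returned by the URL in the question, as a plain string.
raw = '{"token":{"token":"randomtoken","result":1,"resultCode":"2","requestId":"3"}}'

data = json.loads(raw)           # str -> dict
token = data['token']['token']   # outer key, then inner key
print(token)
```

If a key might be missing, data.get('token', {}).get('token') returns None instead of raising KeyError.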
",python
+"How to limit the amount of a string in a text fileI have some code that records an Input name and score and I want to make it so that there can only be 3 entries of that same name therefore I am trying to figure out how to check for a certain string in a file and make sure there is only 3 of that string. This is a code i previously found but I want to make it so once it recognises 3 names (if there are 3 names) kill the program.
+# checks for multiple name entries max 3
+def maxNamesA1():
+count = 0
+with open('testResultA1.txt') as myfile:
+ if name in myfile.read():
+ count += 1
+ print(count)
+
","You need to iterate through the lines to get a count; as of now you only increase the count once, no matter how many lines match.
+# checks for multiple name entries max 3
+def maxNamesA1():
+    count = 0
+    with open('testResultA1.txt') as myfile:
+        count = sum(1 for line in myfile if name in line)
+
+    # terminate once the name already has 3 entries
+    if count >= 3: exit()
+    return count
+
",python
+"How do I add to a list in python using via inputsI'm a rookie at programming. I want to create an open list that I can add to via inputs. The if statement I created does not work. It prints the list even if I enter Y in my response to the second input. I'm sure there are many issues with this but if someone could tell me what I'm doing wrong or if there is a better alternative it would be greatly appreciated.
+tickers = []
+print(f'current tickers list {tickers}')
+f = input("please enter a ticker symbol you would like to watch:\n")
+tickers.append(f)
+q = input("would you like to add another ticker symbol Y or N: ")
+if q == 'Y':
+    print(f)
+    tickers.append(f)
+else:
+    print(f'Updated tickers list {tickers}')
+
","tickers = []
+
+while True:
+    print(f'current tickers list {tickers}')
+    f = input("please enter a ticker symbol you would like to watch:\n")
+    tickers.append(f)
+    q = input("would you like to add another ticker symbol Y or N: ")
+    if q == 'Y':
+        pass
+    else:
+        print(f'Updated tickers list {tickers}')
+        break
+
",python
+"How to have integer subtract from a string in python?I'm currently working on a chatbot from scratch as an assignment for my intro to python class. In my assignment I need to have at least one "mathematical component". I cant seem to find out how to have my input integer subtract from a string.
+Attached is a screen shot, My goal is to have them input how many days a week they cook at home and have that subtract from 7 automatically.
+print('Hello! What is your name? ')
+my_name = input ()
+print('Nice to meet you ' + my_name)
+print('So, ' + my_name + ' What is your favorite veggie?')
+favorite_veggie = input ()
+print('Thats nuts! ' +favorite_veggie + ' is mine too!')
+print('How many days a week do you have cook at home? ')
+day = input ()
+print('So what do you do the other ' + ????? 'days?')
+
","You are looking for this
+day = input()
+required_days = 7 - int(day)
+print('So what do you do the other ' + str(required_days) + ' days?')
+
",python
+"Numpy array indexing with another matrixI looked into other posts related to indexing numpy array with another numpy array, but still could not wrap my head around to accomplish the following:
+a = [[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]],
+b = [[[1,0],[0,1]],[[1,1],[0,1]]]
+a[b] = [[[7,8,9],[4,5,6]],[[10,11,12],[4,5,6]]]
+
+a is an image represented by 3D numpy array, with dimension 2 * 2 * 3 with RGB values for the last dimension. b contains the index that will match to the image. For instance for pixel index (0,0), it should map to index (1,0) of the original image, which should give pixel values [7,8,9]. I wonder if there's a way to achieve this. Thanks!
","Here's one way:
+In [54]: a = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
+
+In [55]: b = np.array([[[1, 0], [0, 1]], [[1, 1], [0, 1]]])
+
+In [56]: a[b[:, :, 0], b[:, :, 1]]
+Out[56]:
+array([[[ 7, 8, 9],
+ [ 4, 5, 6]],
+
+ [[10, 11, 12],
+ [ 4, 5, 6]]])
+
",python
+"getting python to print out the rows from my csv that are between 2 numbersI have code at the moment that prints out the rows of data from a csv based on user input, this code is displayed below:
+#allows user input to select a column and then select a value from that column
+data = pd.read_csv('/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pip/Locations.csv')
+rowcol = 0 #the column that is being searched is column 0
+row1 = int(input("Enter Number: ")) #enter in your first point on the map Eg 15
+row2 = int(input("Enter Number: ")) #enter in your 2nd point on the graph eg 18
+result = data.iloc[[row1, row2]]
+
+But now I want my code to also print out the rows that are between these 2 values that are entered (e.g if the user puts in 12 and 15 it prints out the rows for 12, 13, 14 and 15)
+this is what I have at the moment but I'm not sure how to go further:
+num_list = []
+for i in range(row1+1, row2):
+ num_list.append(i)
+
+print(f'Numbers between 2 points are:\n{btwpoints}')
+
","You could use range:
+df.iloc[range(row1, row2)]
+
+If you need to include row2 as well:
+df.iloc[range(row1, row2 + 1)]
+
+You should manage obvious exceptions anyway (like row2 < row1, or out of bounds situations).
+Out of bounds situations could be managed like this:
+df.iloc[range(max(0, row1), min(df.shape[0], row2+1))]
+
+Having row2 <= row1 will return an empty DataFrame which can be an acceptable output
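The bounds handling can be checked on its own, without a DataFrame. The sketch below is one possible helper, with the hypothetical name row_positions. Unlike the one-liner above, it also swaps the endpoints when they are given in reverse order, which is a design choice rather than something the question asked for.

```python
def row_positions(row1, row2, n_rows):
    """Positions from row1 to row2 inclusive, clamped to 0..n_rows-1."""
    low, high = sorted((row1, row2))      # accept the endpoints in any order
    return range(max(0, low), min(n_rows, high + 1))


print(list(row_positions(12, 15, 100)))   # [12, 13, 14, 15]
```

The result can be passed to iloc in the same way as the ranges above, for example df.iloc[row_positions(row1, row2, df.shape[0])].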
",python
+"Python Selenium. Scraping web pageI want to get the data from the box inside 'Stock Style - Weight' from the url 'https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3' using Selenium
+This data is in an iframe. I´m able to switch to the iframe and click the button = 'Weight' but i can´t get the nine figures
+Below is my code
+driver = webdriver.Chrome(chromedriver)
+driver.get("https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3")
+
+iframe = WebDriverWait(driver, 10).until(
+ EC.presence_of_element_located((By.XPATH, "//iframe[@id='portfolio']")))
+driver.switch_to.frame(iframe)
+
+element1=driver.find_element_by_xpath('/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]')
+element2=element1.find_element_by_css_selector("input[type='radio'][value='Weight']").click()
+
+I´ve tried several options
+driver.find_element_by_xpath('*//div/div[2]/div/div[2]/div/svg/g/g[3]/g[2]/g[1]/text')
+
+driver.find_element_by_css_selector("mbc-chart-group> g.style-box-text-layer > g:nth-child(1)")
+
+but i get the same error
+NoSuchElementException: no such element: Unable to locate element
+
","The elements are inside svg and text tags. To access them you need to use:
+//*[local-name()='svg'] or //*[name()='svg']
+
+Link to refer
+The XPath for those numbers would be:
+//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']
+
+Try like below and confirm:
+numbers = driver.find_elements_by_xpath("//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']")
+for num in numbers:
+    print(num.text)
+
+15
+6
+4
+22
+14
+2
+19
+13
+2
+
",python
+file structure and init method for webjob/function using pythonI am a beginner in Python and want to test some code in a WebJob and a Function App. I usually write code in C#, where Visual Studio provides templates for creating a WebJob/Function App, so all the required files and init code are generated. Now, using Python, I need the required file structure and init code.
,"The folder/file structure for a Python Functions project looks like:
+<project_root>/
+ | - .venv/
+ | - .vscode/
+ | - my_first_function/
+ | | - __init__.py
+ | | - function.json
+ | | - example.py
+ | - my_second_function/
+ | | - __init__.py
+ | | - function.json
+ | - shared_code/
+ | | - __init__.py
+ | | - my_first_helper_function.py
+ | | - my_second_helper_function.py
+ | - tests/
+ | | - test_my_second_function.py
+ | - .funcignore
+ | - host.json
+ | - local.settings.json
+ | - requirements.txt
+ | - Dockerfile
+
+Init method for a webjob/function using Python
+__init__.py
+import azure.functions as func
+import logging
+
+
+def main(req: func.HttpRequest,
+         obj: func.InputStream):
+    logging.info(f'Python HTTP triggered function processed: {obj.read()}')
+
+Please follow the Azure Functions Python developer guide and the guide to Python Azure Functions using VS Code.
",python
+"AttributeError: module ' ' has no attribute 'Command'In my Django project ,there is a file(module) which is used to load csv data.
+project/apps/patients/management/commands/load_patient_data.py
+
+Inside the file(module):
+import psycopg2
+import csv
+conn = psycopg2.connect(host='localhost', dbname='patientdb',user='username',password='password',port='')
+cur = conn.cursor()
+
+with open(r'apps/patients/management/commands/events.csv') as csvfile:
+ spamreader = csv.DictReader(csvfile, delimiter=',' ,quotechar=' ')
+ for row in spamreader:
+ cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
+ ('{row['PATIENT ID']}','{row['EVENT TYPE']}','{row['EVENT VALUE']}',
+ '{row['EVENT UNIT']}','{row['EVENT TIME']}')""")
+conn.commit()
+
+When I run:
+ python manage.py load_patient_data
+
+Error:
+AttributeError: module 'apps.patients.management.commands.load_patient_data' has no attribute 'Command'
+
+Can anyone help?
","In the load_patient_data.py file, write the following:
+from django.core.management.base import BaseCommand, CommandError
+
+class Command(BaseCommand):
+
+    def handle(self, *args, **options):
+        import psycopg2
+        import csv
+        conn = psycopg2.connect(host='localhost', dbname='patientdb', user='username', password='password', port='')
+        cur = conn.cursor()
+
+        with open(r'apps/patients/management/commands/events.csv') as csvfile:
+            spamreader = csv.DictReader(csvfile, delimiter=',', quotechar=' ')
+            for row in spamreader:
+                cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
+                ('{row['PATIENT ID']}','{row['EVENT TYPE']}','{row['EVENT VALUE']}',
+                '{row['EVENT UNIT']}','{row['EVENT TIME']}')""")
+        conn.commit()
+
+Then simply run python manage.py load_patient_data
",python
+"Quadratic equation: TypeError: bad operand type for unary -: 'str'I´m new to Python and have been trying to code the quadratic equation, but i keep running into this error:
+TypeError: bad operand type for unary -: 'str'
+def quad_gleichung():
+a = input('a:')
+b = input('b:')
+c = input('c:')
+
+x1 = int(-b + (b**2 - (4*a*c))**(0.5)) / (2*a)
+x2 = int(-b - (b**2 - (4*a*c))**(0.5)) / (2*a)
+
+
+print('Lösung 1:', x1)
+print('Lösung 2:', x2)
+
+quad_gleichung()
+
+Can anyone please help me?
+Thanks!
","The return type of input is string. It needs to be converted to some number type, float or int, depending on use case.
+Therefore, change the assignments to a, b, c to:
+a = int(input("a: ")) # or float(input("a: "))
+b = int(input("b: ")) # or float(input("b: "))
+c = int(input("c: ")) # or float(input("c: "))
+
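Beyond the input conversion, the question's formula wraps the numerator in int(), which truncates the root before dividing. A minimal sketch of the whole calculation, assuming real coefficients with a non-zero a, could look like this. The function name solve_quadratic is hypothetical, and returning None for a negative discriminant is just one way to handle that case.

```python
import math


def solve_quadratic(a, b, c):
    """Return both real roots of a*x**2 + b*x + c = 0, or None if there are none."""
    discriminant = b ** 2 - 4 * a * c
    if discriminant < 0:
        return None                      # no real solutions
    root = math.sqrt(discriminant)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


print(solve_quadratic(1, -3, 2))   # (2.0, 1.0)
```

With user input, the call would be solve_quadratic(float(input('a: ')), float(input('b: ')), float(input('c: '))).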
",python
+"In abaqus, how to output the maximum stress value at each time point without selecting an objectDuring the compression test, how to output the maximum stress value at each time point without selecting an object
+Steps and question
+I created a field output, MISESMAX (maximum Mises equivalent stress), submitted the job, then chose Create XY Data, clicked ODB field output, and selected MISESMAX (I chose integration point in this part). When I save, Abaqus shows the hint “At least one entity should be selected”.
+Goal
+I want to output the maximum stress value of each step as XY data, but the area where the maximum stress occurs differs from step to step, so how can I output the XY values without selecting an area?
","I found an effective method in the help document
+Finding the maximum value of von Mises stress
+https://help.3ds.com/2020/english/dssimulia_established/simacaecmdrefmap/simacmd-c-odbintroexamaxmisespyc.htm?contextscope=all
",python
+"Scraping Hotel Info by using the existing list of urls in csv fileI have scraped urls of 3 hotel information pages from TripAdvisor and stored in a csv file. After importing the csv file, I have to use these 3 urls to scrape each hotel name, get the price range of each hotel and their hotel class. The tool of Selenium is used.
+
+Here is my code. When using the URL of single hotel, I can scrape the name of hotel. However, when it comes to a lot of hotels to scrape, it doesn't work. It seems there are problems in "for" loop.
+!pip install selenium
+
+from selenium import webdriver
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.common.keys import Keys
+import csv
+from time import sleep
+from time import time
+from random import randint
+
+browser = webdriver.Chrome(executable_path= 'C:\ProgramData\Anaconda3\Lib\site-packages\jupyterlab\chromedriver.exe')
+result_list=[]
+
+def start_request(q):
+ r = browser.get(q)
+ print("crlawling "+q)
+ return r
+
+def parse(text):
+ container1 = browser.find_elements_by_xpath('//*[@id="taplc_hotel_review_atf_hotel_info_web_component_0"]')
+ mydict = {}
+
+ for results in container1:
+ try:
+ mydict['name'] = results.find_element_by_xpath('//*[@id="HEADING"]')
+
+ except Exception as e:
+ print(e)
+ print('not____________________________found')
+ mydict['name'] = 'null'
+ result_list.append(mydict)
+
+with open('Best3HotelsLink.csv') as f:
+ reader = csv.DictReader(f)
+ for row in reader:
+ req = row['Link']
+ text = start_request(req)
+ parse(text)
+ sleep(randint(1,3))
+
+import pandas as pd
+df = pd.DataFrame(result_list)
+df.to_csv('Detailed Hotelinfo.csv')
+df
+
+I also have tried to scrape the hotel class and the price range of the hotels, but in vain.
+Hotel Class
+Price Range
+I would like to seek your advice on how to fix the above problems. Many thanks.
","If you have a lot of information to scrape, I suggest locating the elements again on each iteration. Try this code:
+def parse(text):
+    sleep(2)  # give the page some time to load (the question imports sleep with "from time import sleep")
+    container1 = browser.find_elements_by_xpath('//*[@id="taplc_hotel_review_atf_hotel_info_web_component_0"]')
+    nbrcontainer = len(container1)
+    mydict = {}
+
+    for i in range(0, nbrcontainer):
+        container1 = browser.find_elements_by_xpath('//*[@id="taplc_hotel_review_atf_hotel_info_web_component_0"]')
+        results = container1[i]
+        try:
+            mydict['name'] = results.find_element_by_xpath('//*[@id="HEADING"]')
+
+        except Exception as e:
+            print(e)
+            print('not____________________________found')
+            mydict['name'] = 'null'
+        result_list.append(mydict)
+
",python
+"Float of each element in a matrix python ::I am trying the convert the elements in a matrix to a float number, wanna the output to be 0.200, instead of 0.2 ?
+(as the numerical precision is not the same as in Matlab for example, and it affects the results on what i want ?
+When I tried float() I got the following error:
+"TypeError: only size-1 arrays can be converted to Python scalars"
+Any help,
+I attached the code:
+import numpy as np
+
+ A=np.array([[ 0.0186428, -0.0056, -0.0056, 0, 0,
+ 0],
+ [-.1263, 0.42087542, -.1263, 0, 0,
+ 0],
+ [-.1263, -.1263, 0.42087542, 0, 0,
+ 0],
+ [0, 0, 0, 0.2, -0,
+ 0 ],
+ [ 0, 0, 0, -0, 0.2,
+ 0 ],
+ [-0, -0, 0, 0, 0,
+ 0.2 ]])
+
+ B=np.array([[1,0,0,0,0,0],[0,1,0,0,0,0],[0,0,1,0,0,0],[0,0,0,1,0,0],[0,0,0,0,1,0],
+ [0,0,0,0,0,1]])
+
+ C=B*A*B # float(C) NOT working ?
+ print(C)
+
","Your variable C is already a numpy array of float values. You can check this yourself by printing its dtype:
+In [23]: C.dtype
+Out[23]: dtype('float64')
+
+If you want to change how the numpy arrays are printed to the console, you can edit the settings with np.set_printoptions. For example:
+In [21]: np.set_printoptions(precision=3, floatmode='fixed')
+
+In [22]: C
+Out[22]:
+array([[ 0.019, -0.000, -0.000, 0.000, 0.000, 0.000],
+ [-0.000, 0.421, -0.000, 0.000, 0.000, 0.000],
+ [-0.000, -0.000, 0.421, 0.000, 0.000, 0.000],
+ [ 0.000, 0.000, 0.000, 0.200, 0.000, 0.000],
+ [ 0.000, 0.000, 0.000, 0.000, 0.200, 0.000],
+ [ 0.000, 0.000, 0.000, 0.000, 0.000, 0.200]])
+
+
+- The precision of 3 sets the values to be printed with three digits of precision
+- The floatmode of 'fixed' means:
+
+
+Always print exactly precision fractional digits, even if this would
+print more or fewer digits than necessary to specify the value
+uniquely.
+
+Note about multiplication
+Based on your comment, it seems that what you are trying to achieve is matrix multiplication of matrices A and B. The * operator is element-wise multiplication. For matrix multiplication, you would want to use np.matmul(np.matmul(B,A), B).
",python
+"How to fill NaN for categorical data randomly?I have a table like this one:
+
+| Sex | SchGend |
+| --- | --- |
+| M | Boys |
+| F | Girls |
+| NaN | Mixed |
+| NaN | Boys |
+
+And I want to fill the NaNs values within this table (there are 100 hundred of them). The SchGend tells if the school is only for boys, only for girls or for both. Thus, to fill the 4th row I want to put M as the sex, but to fill the NaN for the mixed school I want to do it with random value. I have no idea on how to put a condition in the fillna method for pandas.
+So that is my question: how can I do that? Any tips?
","First, fill the values for known values from the school information. Then fill the remaining randomly.
+You can use random.choices to generate a random sequence of "M" and "F" (There should be alternative functions in numpy.random if you prefer).
+If you run the below, you will get different outcomes for the third record.
+from io import StringIO
+import random
+import pandas as pd
+
+data = """
+Sex SchGend
+M Boys
+F Girls
+NaN Mixed
+NaN Boys
+"""
+
+x = pd.read_csv(StringIO(data), sep=r"\s+")  # columns in the sample are separated by whitespace
+
+# fill cases of boys or girls school
+x.loc[x.SchGend == "Boys", "Sex"] = "M"
+x.loc[x.SchGend == "Girls", "Sex"] = "F"
+
+num_na = x.Sex.isna().sum() # number of missing cases
+x.loc[x.Sex.isna(), "Sex"] = random.choices(["M", "F"], k=num_na)
+x
+
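The two-step logic (fill what the school type implies, then randomise the rest) can also be sketched without pandas. The rows list and the implied mapping below are hypothetical stand-ins for the table in the question, with None marking a missing value.

```python
import random

# Hypothetical stand-in for the table; None marks a missing Sex value.
rows = [
    {'Sex': 'M', 'SchGend': 'Boys'},
    {'Sex': 'F', 'SchGend': 'Girls'},
    {'Sex': None, 'SchGend': 'Mixed'},
    {'Sex': None, 'SchGend': 'Boys'},
]

implied = {'Boys': 'M', 'Girls': 'F'}   # single-sex schools fix the value

for row in rows:
    if row['Sex'] is None:
        # Use the implied value when there is one, otherwise pick at random.
        row['Sex'] = implied.get(row['SchGend']) or random.choice(['M', 'F'])

print(rows)
```

The fourth row always becomes M, while the third row varies from run to run.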
",python
+"Python threading.Semaphore vs asyncio.SemaphoreI recently noticed there are two different Semaphore implementations in different packages in python, one is in threading package and another is in asyncio package. And I am curious what is the difference between these two implementation? If in async function I try to use Semaphore from threading package, would that cause any potential problems?
+And by checking python official documentation, it wrote
+asyncio primitives are not thread-safe, therefore they should not be used for OS thread synchronization (use threading for that)
+
+But what does that mean by saying asyncio primitives are not thread-safe and should not be used for OS thread sync?
+Thanks in advance
","A semaphore limits how many "pieces of code" can use a resource at the same time. It holds a counter: each acquire decrements it, each release increments it, and an acquire has to wait while the counter is zero. A semaphore created with a value of 1 gives exclusive access, like a lock.
+What I mean by "piece of code" in the previous statement depends on whether I'm using multi-threading, multi-processing, or asyncio. And the means by which that limit is enforced depends on which one I'm using.
+Asyncio is the most restricted kind of concurrency. Everything is running within a single Python thread. The Python interpreter is only executing one thing at a time. Each "piece of code" runs until it voluntarily waits for something to happen. Then another "piece of code" is allowed to run. Eventually the original piece of code runs again when the thing it was waiting on happens.
+With multithreading, multiple pieces of code are running within the Python interpreter. Only one piece of code runs at any time, but they are not politely waiting for each other. Python switches from "piece of code" to "piece of code" as it wants.
+With multiprocessing, multiple Pythons are running simultaneously.
+There is no sharing between those processes, other than what is provided by the operating system. To set up a semaphore across them usually requires some support from the operating system to create a shared variable that all processes can access.
+So. Asyncio primitives are designed to be used within a single thread, by coroutines that cooperate with each other. They are not designed to work if multiple threads try to use them simultaneously. Conversely, a threading.Semaphore used inside an async function blocks the whole event loop while it waits, so no other coroutine can run in the meantime.
+I hope this helps.
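For illustration, here is a small sketch of asyncio.Semaphore limiting how many coroutines run a section at once. The worker and main functions and the limit of 2 are made up for the example, and the sleep stands in for real I/O.

```python
import asyncio


async def worker(sem, active, peak):
    async with sem:                      # waits without blocking the event loop
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)        # stand-in for real I/O
        active[0] -= 1


async def main():
    sem = asyncio.Semaphore(2)           # at most 2 workers inside at once
    active, peak = [0], [0]
    await asyncio.gather(*(worker(sem, active, peak) for _ in range(5)))
    return peak[0]


peak = asyncio.run(main())
print(peak)   # 2
```

The counters need no extra locking here because only one coroutine runs at a time and control changes hands only at an await.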
",python
+"To Find Vowel-Substring From a StringI have a string of lowercase English letters and an integer of the substring length. I have to find the substring of that length that contains the most vowels.
+Example:
+s = 'azerdii'
+k = 5
+The possible 5 character substrings are:
+
+- 'azerd' No. of vowels = 2
+- 'zerdi' No. of vowels = 2
+- 'erdii' No. of vowels = 3
+
+So the answer should be 'erdii'
+Here is my code:
+def findSubstring(s, k):
+    i = 0
+    lst = []
+    count = 0
+    tempL = []
+
+    while(i != len(s)):
+        a = i+k
+        temp = s[i:a]
+        lst.append(temp)
+        if a != len(s):
+            i+=1
+        else:
+            break
+
+    for word in lst:
+        for alphabet in word:
+            if alphabet in 'aeiou':
+                count += 1
+        tempL.append(count)
+    print(lst)
+    print(tempL)
+    return
+
+s = input()
+
+k = int(input().strip())
+
+print(findSubstring(s, k))
+
+I'm getting
+
+['azerd', 'zerdi', 'erdii']
+[2, 4, 7]
+
+But the count should be
+[2, 2, 3]
+
+Please forgive any stupid errors I may have. I will certainly appreciate any help.
","Try the following:
+def find_substr(s, k):
+    substrings = [s[i:i+k] for i in range(len(s) - k + 1)] # list of substrings
+    # vowels = [sum(c in 'aeiou' for c in s) for s in substrings]
+    # print(vowels) # [2, 2, 3]
+    return max(substrings, key=lambda s: sum(c in 'aeiou' for c in s))
+
+print(find_substr('azerdii', 5)) # 'erdii'
+
+If you un-comment the lines that are commented out, you will see the number of vowels is correctly computed as [2, 2, 3].
+
+Here, sum(c in 'aeiou' for c in s) gets the number of vowels in a string s, which is equivalent to
+count = 0
+for alphabet in word:
+    if alphabet in 'aeiou':
+        count += 1
+
+which in turn is the same as your code, except the line count = 0. After processing each word, you need to reset count. So try adding count = 0 in your code.
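For long strings, a sliding window avoids recounting every substring from scratch. This is a different technique from the answer above, shown only as a sketch. The name best_vowel_window is hypothetical, and it assumes 1 <= k <= len(s).

```python
def best_vowel_window(s, k):
    """Return the first length-k substring of s with the most vowels."""
    vowels = set('aeiou')
    count = sum(c in vowels for c in s[:k])      # vowels in the first window
    best_count, best_start = count, 0
    for i in range(k, len(s)):
        # Slide right: add the entering character, drop the leaving one.
        count += (s[i] in vowels) - (s[i - k] in vowels)
        if count > best_count:
            best_count, best_start = count, i - k + 1
    return s[best_start:best_start + k]


print(best_vowel_window('azerdii', 5))   # erdii
```

Each step updates the count in constant time, so the whole scan is linear in the length of s.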
",python
+"How to run seperate folder of pytests in VScodeI'm trying to run a pytests for my code in VSCode from a separate folder, but I keep getting the following error:
+ModuleNotFoundError: No module named 'src'
+
+My basic file structure is:
+Root
+|
+|- src
+| |- file1.py
+|
+|- tests
+| |- test_file.py
+
+My code in test_file.py is:
+from src import file1
+
+I saw How to run tests without installing package? and all of the millions of other import questions but I still can't figure it out. I've also tried
+from ..src import file1
+
+but then I get an ImportError: attempted relative import beyond top-level package
+How do I use imports in this way to run tests on my code?
","You can modify the PYTHONPATH to avoid this problem. You can refer to the official docs.
+
+An example of when to use PYTHONPATH would be if you have source code
+in a src folder and tests in a tests folder. When running tests,
+however, those tests can't normally access modules in src unless you
+hard-code relative paths.
+To solve this problem, you could add the path to src to PYTHONPATH by
+creating an .env file within your VS Code workspace.
+PYTHONPATH=src
+Then set python.envFile in your settings.json file to point to the
+.env file you just created. For example, if the .env file was in your
+workspace root, your settings.json would be set as shown:
+"python.envFile": "${workspaceFolder}/.env"
+
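+Tests are not run through the integrated terminal, so setting terminal.integrated.env.* will not work. Note that with PYTHONPATH=src the test should import the module directly (import file1); to keep from src import file1, point PYTHONPATH at the workspace root instead.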
",python
+"Count the number of atoms of each elementI have to access a file and count the number of atoms of each element. That is, count the number of times of the last character.
+For example, I have a file named 14ly.pdb with the following lines:
+ATOM 211 N TYR A 27 4.697 8.290 -3.031 1.00 13.35 N
+
+ATOM 212 CA TYR A 27 5.025 8.033 -1.616 0.51 11.29 C
+
+ATOM 214 C TYR A 27 4.189 8.932 -0.730 1.00 10.87 C
+
+ATOM 215 O TYR A 27 3.774 10.030 -1.101 1.00 12.90 O
+
+I should get as a result: 'N':1, 'C':2, 'O':1, that is, 1 atom of type N, 2 of type C and 1 of type O.
+I have the following incomplete code that I need to complete:
+import os
+
+def count_atoms(pdb_file_name):
+ num_atoms = dict()
+ with open(pdb_file_name) as file_content:
+
+##Here should go the code I need##
+
+return num_atoms
+result = count_atoms('14ly.pdb')
+print(result)
+
","number_of_atoms = dict()
+with open("14ly.pdb", "r") as f:
+    for line in f:
+        line_words = line.split()  # split on any run of whitespace
+        if not line_words:  # skip blank lines
+            continue
+        last_char = line_words[-1]  # the element symbol is the last column
+        if last_char in number_of_atoms:
+            number_of_atoms[last_char] += 1
+        else:
+            number_of_atoms[last_char] = 1
+print(number_of_atoms)
+
+I think this should be enough
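The same count can be written more compactly with collections.Counter. The sketch below takes any iterable of lines (an open file works too) and, as an added assumption, only looks at ATOM records. The sample list is the four lines from the question plus a made-up END line.

```python
from collections import Counter


def count_atoms(lines):
    """Count element symbols (last column) over the ATOM records in lines."""
    return Counter(
        line.split()[-1]
        for line in lines
        if line.startswith('ATOM')        # assumption: ignore other record types
    )


sample = [
    'ATOM 211 N TYR A 27 4.697 8.290 -3.031 1.00 13.35 N',
    'ATOM 212 CA TYR A 27 5.025 8.033 -1.616 0.51 11.29 C',
    'ATOM 214 C TYR A 27 4.189 8.932 -0.730 1.00 10.87 C',
    'ATOM 215 O TYR A 27 3.774 10.030 -1.101 1.00 12.90 O',
    'END',
]
result = count_atoms(sample)
print(dict(result))   # {'N': 1, 'C': 2, 'O': 1}
```

To read from the file in the question, pass the open file object: with open('14ly.pdb') as f: result = count_atoms(f).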
",python
+"Specifying steps in y-axis of Pandas plot()I use the following plot function to create a line chart of a Pandas dataframe
+row = df.iloc[0].astype(int)
+plt.subplot(1, 2, 1)
+row.plot(marker='o', fontsize=20, ylabel=yax_label)
+plt.show()
+
+Problem is that, the y steps are shown in float (0.5 steps). Is there any way to control that? For example, 3,4,5,6,7 as integers.
+![]()
","You can use .set_yticks():
+(I can't run your code without sample data, so this is a standalone example.)
+import matplotlib.pyplot as plt
+import numpy as np
+
+fig, axe = plt.subplots(1, 3, constrained_layout=True)
+
+axe[0].plot(range(10))
+axe[0].set_yticks(np.arange(0,10,0.5))
+
+axe[1].plot(range(10))
+axe[1].set_yticks(np.arange(0,10))
+
+axe[2].plot(range(10))
+axe[2].set_yticks(np.arange(0,10,2))
+plt.show()
+
+Output:
+![]()
",python
+"How can I enumerate and add margin_titles to each subplot in a seaborn lmplot facetgrid?I have the following attached lmplot facetgrid ![]()
+To start with, I want to simplify the title of each subplot, to only have corpus = {corpus name}.
+I am generating these plots using the lmplot as per
+g=sns.lmplot('x', 'y', data=test_plot, col='corpus', hue = 'monotonicity', row='measure', sharey=True, sharex=True, height=2.5,aspect=1.25, truncate=False, scatter_kws={"marker": "D", "s": 20})
+
+g=(g.set_axis_labels("Max-Min (measure)", "Max-Min (comp measure)")
+ .set(xlim=(0, 1), ylim=(-.1, 1))
+ .fig.subplots_adjust(wspace=.02))
+
+I want to use the facetgrid margin_title option to put the measure value on the right y-axis, but get lmplot() got an unexpected keyword argument 'margin_titles'
+I then tried using a facetgrid, as per:
+p = sns.FacetGrid(data = test_plot,
+ col = 'corpus',
+ hue = 'monotonicity',
+ row = 'measure',
+ margin_titles=True)
+
+
+p.map(sns.lmplot, 'diff_', 'score_diff', data=test_plot, he='monotonicity', truncate=False, scatter_kws={"marker": "D", "s": 20})
+
+but then I get an error about lmplot() got an unexpected keyword argument 'color' (cannot figure out why that is being thrown?).
+My second problem is that I want to add a letter/enumeration to each subplot's title, as in (a), ..., (i), but for the life of me cannot figure out how to do this.
","Because of your custom needs, consider iterating through all the axes of the FacetGrid after running your lmplot. Regarding your specific error, seaborn.lmplot is a figure-level function that builds its own FacetGrid, so it conflicts when nested inside another FacetGrid as in your second attempt. Also, in the solution below, do not re-assign g to the result of the axes setup chain: it ends in fig.subplots_adjust(), which returns None:
+#... SAME lmplot ...
+
+(
+ g.set_axis_labels("Max-Min (measure)", "Max-Min (comp measure)")
+ .set(xlim=(0, 1), ylim=(-.1, 1))
+ .fig.subplots_adjust(wspace=.02)
+)
+
+alpha = list('abcdefghijklmnopqrstuvwxyz')
+axes = g.axes.flatten()
+
+# ADJUST ALL AXES TITLES
+for ax, letter in zip(axes, alpha[:len(axes)]):
+ ttl = ax.get_title().split("|")[1].strip() # GET CURRENT TITLE
+ ax.set_title(f"({letter}) {ttl}") # SET NEW TITLE
+
+# ADJUST SELECT AXES Y LABELS
+for i, m in zip(range(0, len(axes), 3), test_plot["measure"].unique()):
+ axes[i].set_ylabel(m)
+
+Input (purely random data for demonstration)
+import numpy as np
+import pandas as pd
+
+np.random.seed(1172021)
+
+test_plot = pd.DataFrame({
+ 'measure': np.random.choice(["precision", "recall", "F1-score"], 500),
+ 'corpus': np.random.choice(["Fairview", "i2b2", "MiPACQ"], 500),
+ 'monotonicity': np.random.choice(["increasing", "non", "decreasing"], 500),
+ 'x': np.random.uniform(0, 1, 500),
+ 'y': np.random.uniform(0, 1, 500)
+})
+
+
+Output
+![]()
",python
+"Given an array arr of size n and an integer X. Find if there's a triplet in the array which sums up to the given integer XGiven an array arr of size n and an integer X. Find if there's a triplet in the array which sums up to the given integer X.
+ Input:
+ n = 5, X = 10
+ arr[] = [1 2 4 3 6]
+
+ Output:
+ Yes
+
+ Explanation:
+ The triplet {1, 3, 6} in
+ the array sums up to 10.
+
","the line of reasoning is:
+Get all the possible combinations of 3 numbers in the array arr. Find which has sum=X, print only these triplets
+import numpy as np
+import itertools
+arr=np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
+X=10
+combinations=np.array(list(itertools.combinations(arr, 3)))
+triplets=combinations[combinations.sum(axis=1)==X]
+
+print(f'Triplets with sum equal to {X} are:\n{triplets}')
+
+output:
+Triplets with sum equal to 10 are:
+[[0 1 9]
+ [0 2 8]
+ [0 3 7]
+ [0 4 6]
+ [1 2 7]
+ [1 3 6]
+ [1 4 5]
+ [2 3 5]]
+
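If only the Yes/No answer from the problem statement is needed, the same brute-force idea works without numpy. The function name has_triplet is hypothetical. It checks every combination of three elements, so it is only a sketch for small arrays.

```python
from itertools import combinations


def has_triplet(arr, target):
    """True if any three elements of arr sum to target."""
    return any(sum(triple) == target for triple in combinations(arr, 3))


print('Yes' if has_triplet([1, 2, 4, 3, 6], 10) else 'No')   # Yes
```

Because any() stops at the first matching triple, the search ends early whenever a solution exists.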
",python
+"Problem with rotating a two-dimensional array in pythonHow to to make this array rotate by 90 degrees to right without using numpy.
+multiarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
+auxiliaryArray = []
+colLength = len(multiarray[0])
+rowLength = len(multiarray)
+for indexRow in range(len(multiarray)):
+ for indexCol in range(len(multiarray[0])):
+ auxiliaryArray[indexCol][rowLength - 1 - indexRow] = multiarray[indexRow][indexCol]
+
+print(auxiliaryArray)
+
+Error:
+IndexError: list index out of range
+Desired Output: [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
","You can use zip on the reversed array:
+auxiliaryArray = list(zip(*multiarray[::-1]))
+
+or
+auxiliaryArray = list(zip(*reversed(multiarray)))
+
+output: [(7, 4, 1), (8, 5, 2), (9, 6, 3)]
+If you need lists and not tuples:
+auxiliaryArray = list(map(list, zip(*reversed(multiarray))))
+
+output: [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
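For comparison, the loop from the question can also be kept. It most likely fails because auxiliaryArray starts out empty, so auxiliaryArray[indexCol] does not exist yet. A minimal sketch, assuming the target grid is allocated before the loop (the names rotated, row_length and col_length are illustrative):

```python
multiarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
row_length = len(multiarray)
col_length = len(multiarray[0])

# Allocate the target grid first: col_length rows of row_length cells each.
rotated = [[None] * row_length for _ in range(col_length)]

for r in range(row_length):
    for c in range(col_length):
        # Row r of the source becomes column (row_length - 1 - r) of the result.
        rotated[c][row_length - 1 - r] = multiarray[r][c]

print(rotated)  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```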
",python
+"Manipulating Dataframes in different sub directoriesI have many subdirectories, each containing a unique dataset. I want to apply some manipulations to each DataFrame individually: access a subdirectory, do the manipulation, go to the next directory and do the same. For illustrative purposes, here is some code:
+import pandas as pd
+import numpy as np
+import os
+
+
+os.mkdir('folder1')
+
+d = {'column1': ['a', 'a', 'b', 'b', 'c'], 'column2': [10, 8, 6, 4, 2], 'column3': [1, 2, 3, 4, 5]}
+test_a = pd.DataFrame(data=d)
+test_a.to_csv('folder1/test_a.csv')
+
+os.mkdir('folder2')
+g = {'column1': ['a', 'a', 'b', 'b', 'c'], 'column2': [10, 8, 6, 4, 2], 'column3': [1, 2, 3, 4, 5]}
+test_b = pd.DataFrame(data=g)
+test_b.to_csv('folder2/test_b.csv')
+
+The code above creates the subdirectories and saves an example DataFrame in each of them.
+Let's say I want to achieve the following:
+Group each dataset in each folder by column1 (count) and save the result in the corresponding subdirectory as a separate data frame. Preferably, each output should be named after the stem of the file name (test in this case) rather than its extension (csv).
+I can write the general function that groups the datasets, but I don't know how to access each subdirectory (probably with a for loop and the os/glob packages).
+Thanks in advance.
","Use pathlib:
+import pandas as pd
+import pathlib
+
+# directory where data files are stored
+data_dir = pathlib.Path('data')
+
+for csvfile in data_dir.glob('**/*.csv'):
+    print(f"Processing '{csvfile.name}' in '{csvfile.parent}'")
+    df = pd.read_csv(csvfile)
+    # do stuff here
+    out = df.groupby('column1').mean()  # mean or whatever you want
+    out.to_csv(csvfile.parent / f"{csvfile.stem}_grp.csv")
+    print(f"Saved as '{csvfile.stem}_grp.csv' in '{csvfile.parent}'")
+    print()
+
+Output:
+Processing 'test_a.csv' in 'data/folder1'
+Saved as 'test_a_grp.csv' in 'data/folder1'
+
+Processing 'test_b.csv' in 'data/folder2'
+Saved as 'test_b_grp.csv' in 'data/folder2'
+
+Directory tree:
+data
+├── folder1
+│   ├── test_a.csv
+│   └── test_a_grp.csv
+└── folder2
+    ├── test_b.csv
+    └── test_b_grp.csv
+
",python
+"How can I take a list input from a file and turn it into a dictionary?So I've been looking all over StackOverflow for a way to properly solve my issue, but I haven't found anything suitable.
+I am taking in a file that has words with associated values in the format of:
+alone,1
+amazed,10
+amazing,10
+bad,1
+etc.
+
+I am taking in this text file and reading the lines, which breaks each of the lines into a list. I then need to transfer this list into a dictionary type, where the keyword and value are kept associated.
+I found a solution for this problem in another question, but it currently gives an output that includes \n within the value part of the dictionary.
+Here is the code:
+keywords_file = open('keywords.txt')
+keywords = keywords_file.readlines()
+
+def keyword_to_dictionary(keywords):
+    result = [{}]
+    for item in keywords:
+        key, val = item.split(",", 1)
+        if key in result[-1]:
+            result.append({})
+        result[-1][key] = val
+    return result
+
+Output:
+[{'alone': '1\n', 'amazed': '10\n', 'amazing': '10\n', 'bad': '1\n', 'best': '10\n',
+'better': '10\n', 'excellent': '10\n', 'excited': '10\n', 'excite': '10\n', 'excites':
+'10\n', 'exciting': '10\n', 'glad': '10\n', 'god': '5\n', 'good': '7\n', 'great': '7\n',
+'greatest': '10\n', 'haha': '5\n', 'hahaha': '5\n', 'happy': '10\n', 'hate': '1\n',
+'hurt': '1\n', 'hurts': '1\n'}]
+
+The output is longer but I hope that gives a decent idea of what is happening. I understand the issue, but I'm not sure how to go about solving it.
+To give some context in how this is going to be used, I have a comp sci assignment that requires me to equate a few thousand lines of twitter data with keywords in this file, outputting a sort of happiness level average that is going to be associated with the geographical area that the tweet was sent in, determined by timezones. I need to be able to access the keywords and equated value in order to do this and figured a dictionary was the best way to do that.
+context tldr: need to access the keyword and value associated with it when iterating through thousands of lines of Twitter data.
+Any help would be greatly appreciated.
+Apologies if there is anything wrong with this post or more information is required which I didn't provide.
+Thanks in advance.
","You can use a dictionary comprehension:
+with open('keywords.txt') as f:
+    result = {k: int(v) for line in f for k, v in [line.strip().split(',')]}
+
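The comprehension above is compact; the same idea can be spelled out as a loop. The sketch below is only an illustration: it reads from an in-memory io.StringIO stand-in instead of keywords.txt so that it runs on its own, and the names sample and keywords are assumptions.

```python
import io

# Stand-in for open('keywords.txt'); the rows mirror the sample in the question.
sample = io.StringIO('alone,1\namazed,10\namazing,10\nbad,1\n')

keywords = {}
for line in sample:
    line = line.strip()          # drops the trailing '\n' that ended up in the values
    if not line:
        continue                 # skip empty lines
    word, score = line.split(',', 1)
    keywords[word] = int(score)  # store the score as a number, not a string

print(keywords)  # {'alone': 1, 'amazed': 10, 'amazing': 10, 'bad': 1}
```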
",python
+"Python project - Writing contents of .txt file to Pandas dataframeI'm currently working on a Python project where I want to:
+
+- Loop through subdirectories of a root directory
+- Find .txt files with names starting with 'memory_'. Txt files are:
+newline-separated, lines consist of: 'colName: Value' pairs. Like this.
+
+Memory dump
+
+Serialnr: 1412b23990
+
+Date/time: 24-11-2016 08:10
+
+
+
+mode: version
+
+Hardware release: ic2kkit01*P131113*
+
+Software release: V3.82
+
+Rom test 1 checksum: e0251fda
+
+Rom test 2 checksum: cae0351f
+
+
+
+mode: statistics
+
+Line power connected (hours): 360
+
+Line power disconnected (number of times): 2
+
+Ch function(hours): 54
+
+Dhw function(hours): 4
+
+Burnerstarts (number of times): 604
+
+Ignition failed (number of times): 0
+
+Flame lost (number of times): 0
+
+Reset (number of times): 0
+
+
+
+mode: status
+
+T1: 17.42
+
+T2: 17.4
+
+T3: 16.38
+
+T4: -35.0
+
+T5: -35.0
+
+T6: 17.4
+
+Temp_set: 0.0
+
+Fanspeed_set: 0.0
+
+Fanspeed: 0.0
+
+Fan_pwm: 0.0
+
+Opentherm: 0
+
+Roomtherm: 0
+
+Tap_switch: 0
+
+
+
+- Appending the contents of the .txt file to a Pandas data frame with predefined column names. I.e.: I would like to write each .txt file into a data frame row, using the colName:Value pairs. See attached image for how (a part of) the data frame should look like.
+
+Current .py code:
+import os
+import pandas as pd
+
+# Set rootdir for os.walk
+rootdir = 'K:/Retouren'
+
+## Create empty Pandas dataframe with just column names
+column_names = ["Memory dump", "Serialnr", "Date/time", "mode", "Hardware release", "Software release", "Rom test 1 checksum", "Rom test 2 checksum",
+ "mode", "Line power connected (hours)", "Line power disconnected (number of times)", "Ch function(hours)", "Dhw function(hours)", "Burnerstarts (number of times)",
+ "Ignition failed (number of times)", "Flame lost (number of times)", "Reset (number of times)", "Gasmeter_ch", "Gasmeter_dhw", "Watermeter", "Burnerstarts_dhw",
+ "mode", "T1", "T2", "T3", "T4", "T5", "T6", "Temp_set", "Fanspeed_set", "Fanspeed", "Fan_pwm", "Opentherm", "Roomtherm", "Tap_switch", "Gp_switch", "Pump", "Dwk",
+ "Gasvalve", "Io_signal", "Spark", "Io_curr", "Displ_code", "Ch_pressure", "Rf_rth_bound", "Rf_rth_communication", "Rf_rth_battery_info", "Rf_rth_battery_ok",
+ "Bc_tapflow", "Pump_pwm", "Room_override_zone1", "Room_set_zone1", "Room_temp_zone1", "Room_override_zone2", "Room_set_zone2", "Room_temp_zone2", "Outside_temp",
+ "Ot_master_member_id", "Ot_therm_prod_version", "Ot_therm_prod_type", "mode", "Node nr", "Cloud id0", "Cloud id1", "Cloud id2", "Rf cloud nr", "Rssi_lower",
+ "Rssi_upper", "Rssi_wait", "Attention_period", "Attention_number", "Info10", "Info11", "Info12", "Info13", "Info14", "Info15", "Info16", "Info17", "Info18",
+ "Ramses_thermostat_idh", "Ramses_thermostat_idm", "Ramses_thermostat_idl", "Ramses_boiler_idh", "Ramses_boiler_idm", "Ramses_boiler_idl", "Prod. token",
+ "Year", "Month", "Line number", "Serial1", "Serial2", "Serial3", "mode", "Id_dongle0", "Id_dongle1", "Id_dongle2", "Id_dongle3", "Id_lan0", "Id_lan1",
+ "Id_lan2", "Id_lan3", "Info2_7", "Info2_8", "Info2_9", "Info2_10", "Info2_11", "Info2_12", "Info2_13", "Info2_14", "mode", "Interrupt_time",
+ "Interrupt_load (%)", "Main_load (%)", "Net fequency (hz)", "Voltage ref. (v)", "Checksum1", "Checksum2", "nmode", "Fault 0", "Fault 1", "Fault 2",
+ "Fault 3", "Fault 4", "Fault 5", "Fault 6", "Fault 7", "Fault 8", "Fault 9", "Fault 10", "Fault 11", "Fault 12", "Fault 13", "Fault 14", "Fault 15",
+ "Fault 16", "Fault 17", "Fault 18", "Fault 19", "Fault 20", "Fault 21", "Fault 22", "Fault 23", "Fault 24", "Fault 25", "Fault 26", "Fault 27", "Fault 28",
+ "Fault 29", "Fault 30", "Fault 31", "mode", "Heater_on", "Comfort_mode", "Ch_set_max", "Dhw_set", "Eco_days", "Comfort_set", "Dhw_at_night", "Ch_at_night",
+ "Parameter 1", "Parameter 2", "Parameter 3", "Parameter 4", "Parameter 5", "Parameter 6", "Parameter 7", "Parameter 8", "Parameter 9", "Parameter a",
+ "Parameter b", "Parameter c", "Parameter c", "Parameter d", "Parameter e", "Parameter e.", "Parameter f", "Parameter h", "Parameter n", "Parameter o",
+ "Parameter p", "Parameter r", "Parameter f.", "mode", "Param31", "Param32", "Param33", "Param34", "Param35", "Param36", "Param37", "Param38", "Param39",
+ "Param40", "Param41", "Param42", "Param43", "Param44", "Param45", "Param46", "Param47", "Param48", "Param49", "Param50", "Param51", "Param52", "Param53",
+ "Param54", "Param55", "Param56", "Param57", "Param58", "Param59", "Param60", "Param61", "Param62", "Param63"]
+data = pd.DataFrame(columns = column_names)
+
+for subdir, dirs, files in os.walk(rootdir):
+    for file in files:
+        if file.startswith('memory_') and os.path.splitext(file)[1] == '.txt':
+            filepath = os.path.join(subdir, file)
+            with open(filepath, "r") as curfile:
+                data.append()
+                ## Here is where I would like to append the .txt data as a row in the data frame
+
+
+I have the first two steps down, but the third is exceeding my programming knowledge. Any tips would be greatly appreciated.
+Example of the desired dataframe:
+![]()
","I suggest reading the file with readlines(), which returns a list of lines. Then loop over the lines and process only those that contain a colon followed by a space. Stripping each line (to drop the trailing newline), splitting it once on ': ', and wrapping everything in dict() creates a dictionary with the strings before the colon as keys and the strings after it as values:
+dict(i.strip().split(': ', 1) for i in curfile.readlines() if ': ' in i)
+
+for your sample data this would make:
+{'Serialnr': '1412b23990', 'Date/time': '24-11-2016 08:10', 'mode': 'status', 'Hardware release': 'ic2kkit01*P131113*', 'Software release': 'V3.82', 'Rom test 1 checksum': 'e0251fda', 'Rom test 2 checksum': 'cae0351f', 'Line power connected (hours)': '360', 'Line power disconnected (number of times)': '2', 'Ch function(hours)': '54', 'Dhw function(hours)': '4', 'Burnerstarts (number of times)': '604', 'Ignition failed (number of times)': '0', 'Flame lost (number of times)': '0', 'Reset (number of times)': '0', 'T1': '17.42', 'T2': '17.4', 'T3': '16.38', 'T4': '-35.0', 'T5': '-35.0', 'T6': '17.4', 'Temp_set': '0.0', 'Fanspeed_set': '0.0', 'Fanspeed': '0.0', 'Fan_pwm': '0.0', 'Opentherm': '0', 'Roomtherm': '0', 'Tap_switch': '0'}
+
+If you create an empty list before the loop, and append the dictionaries to that list within the loop, you'll end up with a list of dicts that you can load with pandas in one go:
+import os
+import pandas as pd
+
+# Set rootdir for os.walk
+rootdir = 'K:/Retouren'
+
+## Create empty list
+data = []
+
+for subdir, dirs, files in os.walk(rootdir):
+    for file in files:
+        if file.startswith('memory_') and os.path.splitext(file)[1] == '.txt':
+            filepath = os.path.join(subdir, file)
+            with open(filepath, "r") as curfile:
+                # strip() removes the trailing newline before splitting on ': '
+                data.append(dict(i.strip().split(': ', 1) for i in curfile.readlines() if ': ' in i))
+
+df = pd.DataFrame(data)
+
+An added advantage is that you don't need to set the column names manually, because pandas will use the dict keys for that. DataFrame:
+
+| | Serialnr | Date/time | mode | Hardware release | Software release | Rom test 1 checksum | Rom test 2 checksum | Line power connected (hours) | Line power disconnected (number of times) | Ch function(hours) | Dhw function(hours) | Burnerstarts (number of times) | Ignition failed (number of times) | Flame lost (number of times) | Reset (number of times) | T1 | T2 | T3 | T4 | T5 | T6 | Temp_set | Fanspeed_set | Fanspeed | Fan_pwm | Opentherm | Roomtherm | Tap_switch |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| 0 | 1412b23990 | 24-11-2016 08:10 | status | ic2kkit01*P131113* | V3.82 | e0251fda | cae0351f | 360 | 2 | 54 | 4 | 604 | 0 | 0 | 0 | 17.42 | 17.4 | 16.38 | -35 | -35 | 17.4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+
+There is one disadvantage: as a dict can only contain unique keys you will lose two mode values. I'll leave it as they seem to be headers rather than containers of information. Otherwise it would require some additional renaming.
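If the mode values do need to be kept, one possible form of that renaming is to treat each mode line as a section header and prefix the keys that follow it. This is only a sketch under that assumption: the function name parse_memory_dump and the 'section.key' column naming are invented for illustration, and the sample lines are a shortened version of the file in the question.

```python
def parse_memory_dump(lines):
    """Turn 'key: value' lines into one dict, prefixing keys with the current mode."""
    record = {}
    section = None
    for line in lines:
        line = line.strip()
        if ': ' not in line:
            continue                      # skips blank lines and the 'Memory dump' title
        key, value = line.split(': ', 1)
        if key == 'mode':
            section = value               # remember the section instead of overwriting 'mode'
            continue
        column = f'{section}.{key}' if section else key
        record[column] = value
    return record


sample = [
    'Memory dump',
    'Serialnr: 1412b23990',
    '',
    'mode: version',
    'Software release: V3.82',
    'mode: status',
    'T1: 17.42',
]
print(parse_memory_dump(sample))
```

The resulting dicts can be collected in a list and passed to pd.DataFrame in the same way as above.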
",python
+"Poetry installed but `poetry: command not found`I've had a million and one issues with Poetry recently.
+I got it fully installed and working yesterday, but after a restart of my machine I'm back to having issues with it ;(
+Is there any way to have Poetry consistently recognised in my Terminal, even after a reboot?
+
+System Specs:
+
+- Windows 10,
+- Visual Studio Code,
+- Bash - WSL Ubuntu CLI,
+- Python 3.8.
+
+
+Terminal:
+me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ poetry run python3 cli.py
+poetry: command not found
+me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python3
+Retrieving Poetry metadata
+
+This installer is deprecated. Poetry versions installed using this script will not be able to use 'self update' command to upgrade to 1.2.0a1 or later.
+Latest version already installed.
+me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ poetry run python3 cli.py
+poetry: command not found
+me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$
+
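+Please let me know if there is anything else I can add to the post to help clarify further.
","When I run this after restarting the bash terminal:
+export PATH="$HOME/.poetry/bin:$PATH"
+
+the poetry command is recognised again.
+However, this alone isn't enough, because the export only lasts for the current shell session and has to be run again every time the terminal is restarted.
+To make it permanent, the export line needs to be saved in a shell startup file such as ~/.bashrc, which bash reads for each new interactive shell.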
",python
+"How to re-arrange only certain columns of a Pyspark Dataframe without listing out all the column names?I have a Pyspark Dataframe having 100 columns (shown only 5 below for explaining):
+![]()
+I need to re-arrange the index of around 30 specific columns only, and leave the rest as it is.
+The sequence (i.e. index) in which the specific columns need to be arranged is listed out in an Excel table as below (shown only 3 below for explaining):
+![]()
+Now, I could have used df = df.select('C', 'E', 'B',...and so on)
+But it becomes too tedious to write down all 100 column names in the correct sequence above.
+So is there any efficient way to do this by simply reading the sequence from the table as a list and using it as a reference to do the re-arrangement of columns?
+To continue with the example to be clear, if there were just these 5 columns out of which 3 had to be re-arranged, then the output would like this:
+![]()
+Note: I'm using Python 2.7.5 & Spark 2.4.0
","You can access all the columns through the df.columns property (a Python list) and build the new order with a list comprehension:
+# assuming you can create the object sequence from the excel
+sequence = ["C", "E", "B"]
+
+df = df.select(sequence + [col for col in df.columns if col not in sequence])
+
+df
+> DataFrame[C: string, E: string, B: string, A: string, D: string]
+
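The ordering logic can be tried without Spark on plain lists. The helper below is a hypothetical sketch (reorder_columns is not a PySpark function); it also fails early when the Excel sequence names a column that the DataFrame does not have.

```python
def reorder_columns(all_columns, sequence):
    """Return the columns in `sequence` first, then the rest in their original order."""
    missing = [name for name in sequence if name not in all_columns]
    if missing:
        raise ValueError('columns not found: %s' % missing)
    leading = set(sequence)
    return list(sequence) + [name for name in all_columns if name not in leading]


new_order = reorder_columns(['A', 'B', 'C', 'D', 'E'], ['C', 'E', 'B'])
print(new_order)  # ['C', 'E', 'B', 'A', 'D']
```

With a real DataFrame, the result would be passed on as df.select(reorder_columns(df.columns, sequence)).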
",python
+"How do i remove the .0 from an integer?I have an integer in my code and want to connect it with other integers.
+str1 = ''
+int1 = 4
+int2 = 16
+int3 = 32
+int4 = 64
+
+int5 = int4 / int3
+int6 = int2 / int1
+
+str1 += str(int5)
+str1 += str(int6)
+
+print(str1)
+
+now the output would be 2.04.0, but i want it to be 24. How do i remove the .0?
","You need to get values as int so you need to change these two lines:
+int5 = int4 // int3
+int6 = int2 // int1
+
+Then you can use f-string:
+print(f'{int5}{int6}')
+
+Or use your own code:
+str1 += str(int5)
+str1 += str(int6)
+print(str1)
+
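A short sketch of why the .0 shows up and what to watch for: in Python 3, / always returns a float, while // performs floor division and returns an int when both operands are ints. Note that // rounds down, which differs from int() truncation for negative results. The variable names below are illustrative.

```python
int3, int4 = 32, 64

true_div = int4 / int3     # true division: always a float
floor_div = int4 // int3   # floor division: an int for int operands

print(true_div)                  # 2.0
print(floor_div)                 # 2
print(f'{floor_div}{16 // 4}')   # 24

# When the division is not exact, // rounds down toward negative infinity,
# while int() truncates toward zero.
print(7 // 2, int(7 / 2))        # 3 3
print(-7 // 2, int(-7 / 2))      # -4 -3
```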
",python
+"Write a function to reduce duplicate codeI have two very similar for loops, I want to have an inner function to reduce the duplicate codes, they look like this:
+team_members = TeamMember.objects.all()
+managers = Manager.objects.all()
+for m in managers:
+    name = f"{m.name.first_name} {m.name.last_name}"
+    # rest of the code is the same
+for t in team_members:
+    name = f"{t.member.first_name} {t.member.last_name}"
+    # rest of the code is the same
+
+So the problem is managers and team_members querysets have different field names for people's names.
+If I want to write an inner function, how to solve the different field names?
","Here is an inner function that takes the objects as input and picks the right attribute depending on which one each object has.
+def inner_fun(objs):
+    names = []
+    for obj in objs:
+        # Manager keeps the person under 'name', TeamMember under 'member'
+        if hasattr(obj, 'name'):
+            name_obj = getattr(obj, 'name')
+        else:
+            name_obj = getattr(obj, 'member')
+        names.append(f"{name_obj.first_name} {name_obj.last_name}")
+        # the rest of the shared per-object code goes here
+    return names
+
+team_members = TeamMember.objects.all()
+managers = Manager.objects.all()
+
+team_names = inner_fun(team_members)
+manager_names = inner_fun(managers)
+
",python
+"Pandas - Loop through sheetsI have 5 sheets and created a script to do numerous formatting, I tested it per sheet, and it works perfectly.
+import numpy as np
+import pandas as pd
+
+FileLoc = r'C:\T.xlsx'
+Sheets = ['Alex','Elvin','Gerwin','Jeff','Joshua',]
+
+df = pd.read_excel(FileLoc, sheet_name= 'Alex', skiprows=6)
+df = df[df['ENDING'] != 0]
+df = df.head(30).T
+df = df[~df.index.isin(['Unnamed: 2','Unnamed: 3','Unnamed: 4','ENDING' ,3])]
+df.index.rename('STORE', inplace=True)
+df['index'] = df.index
+
+df2 = df.melt(id_vars=['index', 2 ,0, 1] ,value_name='SKU' )
+df2 = df2[df2['variable']!= 3]
+
+df2['SKU2'] = np.where(df2['SKU'].astype(str).fillna('0').str.contains('ALF|NOB|MET'),df2.SKU, None)
+df2['SKU2'] = df2['SKU2'].ffill()
+df2 = df2[~df2[0].isnull()]
+df2 = df2[df2['SKU'] != 0]
+
+df2[1] = pd.to_datetime(df2[1]).dt.date
+df2.to_excel(r'C:\test.xlsx', index=False)
+
+but when I assigned a list in Sheet_name = Sheets it always produced an error KeyError: 'ENDING'. This part of the code:
+Sheets = ['Alex','Elvin','Gerwin','Jeff','Joshua',]
+df = pd.read_excel(FileLoc,sheet_name='Sheets',skiprows=6)
+
+Is there a proper way to do this, like looping?
+My expected result is to execute the formatting that I have created and consolidate it into one excel file.
+NOTE: All sheets have the same format.
+![]()
","With read_excel, passing sheet_name=None returns a dictionary with the sheet names as keys and the corresponding DataFrames as values, so you can loop through it with .items().
+The code would look something like this:
+dfs = pd.read_excel('your-excel.xlsx', sheet_name=None)
+for sheet_name, sheet_df in dfs.items():
+    ...  # apply logic to sheet_df
+
+To combine the sheets, collect each processed DataFrame in a list and join them with pd.concat() once the loop is done (DataFrame.append() was deprecated and then removed in pandas 2.0). That would look something like this:
+frames = []
+dfs = pd.read_excel('your-excel.xlsx', sheet_name=None)
+for sheet_name, sheet_df in dfs.items():
+    # apply logic to sheet_df, which is a DataFrame
+    frames.append(sheet_df)
+combined_df = pd.concat(frames, ignore_index=True)
+
",python
+"How do I make an if statement in with multiple parameter using excel columns in python csvSo basically I am trying to write a statement where if a value in Col1 = 1 or Col2 = 1 than create a new column with the value 10 and if both Col1 and Col2 = 0 the new column should print 0 or just skip.
+so far this is what I did
+if df.Col1 == 1 or df.Col2 == 1:
+    df['newCol'] = 10
+else: pass
+
+This is giving me an error.
","You can do this:
+df['newCol'] = np.where(((df['Col1']==1)|(df['Col2']==1)), 10,np.nan)
+
",python
+"Field data disappears in Django REST frameworkI have a field "plugins" (see below) in my serializer and this is a serializer which also contains a file upload which is why the MultiPartParser is used. My view is pretty much standard, and the plugins field data also shows up in the request.data, however it doesn't show up in the validated_data of the serializer. To bring a minimalistic example, this would be my serializer:
+class CreationSerializer(serializers.ModelSerializer, FileUploadSerializer):
+    plugins = serializers.ListSerializer(
+        child=serializers.CharField(), required=False, write_only=True)
+
+    class Meta:
+        fields = ['plugins'] + FileUploadSerializer.Meta.fields
+        model = Company
+
+    def create(self, validated_data):
+        print(validated_data)
+
+While this would be my views.py:
+@swagger_auto_schema(request_body=CreationSerializer(), responses={201: CreationSerializer()}, operation_id='the_post')
+def create(self, request, *args, **kwargs):
+    print(request.data)
+    return super().create(request, *args, **kwargs)  # which uses mixins.CreateModelMixin
+
+I tried adding another parser (i.e. JSONParser) to the parsers list, but this doesn't change anything.
","Does it work if you replace with this? I'm not sure but maybe drf doesn't recognize ListSerializer as a field, I've always used a Serializer with many=True:
+plugins = serializers.ListField(child=serializers.CharField(), required=False, write_only=True)
+
",python
+"TypeError: can only concatenate str (not ""float"") to strmy code:
+def média_harmonica(x,y):
+    média_harmonica = 2/((1/x)+(1/y))
+    return média_harmonica
+
+x = float(input("Informe um número para x: "))
+y = float(input("Informe um número para y: "))
+
+média_harmonica = (2/((1/x)+(1/y)))
+mensagem = "A média harmonica de "+x+" e "+y+" é: "+float(média_harmonica)
+print(mensagem)
+
","As the error suggests, you cannot concatenate a float to a string. You can add that variable to the string in a few ways:
+
+F-strings:
+mensagem = f"A média harmonica de {x} e {y} é: {média_harmonica}"
+
+Casting to str:
+mensagem = "A média harmonica de " + str(x) + " e " + str(y) + " é: " + str(média_harmonica)
+
+
",python
+"Genetic algorithm: problem of convergenceI'm trying to solve the one-max problem with a genetic algorithm, but it is not converging; instead, the maximum fitness keeps getting lower. I can't see why it's not working: I ran each function on its own and they worked, but I'm not sure about how they are called in the main loop. The one-max problem is when you have a population of N binary individuals (1/0) of length m, and you want to optimize the population until it contains at least one individual made up only of 1s (in my case, 0s).
+Here's my code:
+import random
+
+def fitness(individual):
+    i = 0
+    for m in individual:
+        if m == 0:
+            i += 1
+    return i
+
+def selection(pop):
+    chosen = []
+    for i in range(len(pop)):
+        aspirants = []
+        macs = []
+        for j in range(3):
+            aspirants.append(random.choice(pop))
+        if fitness(aspirants[0]) > fitness(aspirants[1]):
+            if fitness(aspirants[0]) > fitness(aspirants[2]):
+                macs = aspirants[0]
+            else: macs = aspirants[2]
+        else:
+            if fitness(aspirants[1]) > fitness(aspirants[2]):
+                macs = aspirants[1]
+            else: macs = aspirants[2]
+        chosen.append(macs)
+    return chosen
+
+def crossover(offspring):
+    for child1, child2 in zip(offspring[::2], offspring[1::2]):
+        if random.random() < 0.7:
+            child1[50:100], child2[50:100]=child2[50:100], child1[50:100]
+
+def mutate(offspring):
+    for mut in offspring:
+        if random.random() < 0.3:
+            for i in range(len(mut)):
+                if random.random() < 0.05:
+                    mut[i] = type(mut[i])(not mut[i])
+
+def gen_individ():
+    ind = []
+    for s in range(100):
+        ind.append(random.randint(0, 1))
+    return ind
+
+def gen_pop():
+    pop = []
+    for s in range(300):
+        pop.append(gen_individ())
+    return pop
+
+g = 0
+popul = gen_pop()
+print("length of pop = %i "% len(popul))
+fits = []
+for k in popul:
+    fits.append(fitness(k))
+print("best fitness before = %i"% max(fits))
+while(max(fits) < 100 and g < 100):
+    g += 1
+    offspring = []
+    offspring = selection(popul)
+    crossover(offspring)
+    mutate(offspring)
+    popul.clear()
+    popul[:] = offspring
+    fits.clear()
+    for k in popul:
+        fits.append(fitness(k))
+print("lenght of pop = %i "% len(popul))
+print("best fitness after = %i"% max(fits))
+print("generation : %i"%g)
+
","The problem seems to be that in all your functions, you always just modify the same individuals instead of creating copies. For instance, in the selection function you repeatedly select the best-out-of-three (in a rather convoluted way), and then insert multiple references to that same list into the chosen list. Later, when you mutate any of those, you mutate all the references. In the end you might even end up with just N references to the same list, at which point obviously no more actual selection can take place.
+Instead, you should create copies of the lists. This can happen in different places: in your main method, in mutate and recombine, or in the selection for the next iteration. I'll put it into selection, mainly for the reason that this function can be improved in other ways, too:
+def selection(pop):
+    chosen = []
+    for i in range(len(pop)):
+        # a lot shorter
+        aspirants = random.sample(pop, 3)
+        macs = max(aspirants, key=fitness)
+        # create COPIES of the individual, not multiple references
+        chosen.append(macs[:])
+    return chosen
+
+With this, you should get a quality of 100 each time.
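The aliasing effect described above can be reproduced in a few lines. This is a toy illustration with made-up names (best, shared, copies), not part of the algorithm itself.

```python
best = [0, 1, 0]

# Two references to the SAME list: changing one entry changes "both" individuals.
shared = [best, best]
shared[0][1] = 0
print(shared)   # [[0, 0, 0], [0, 0, 0]]

# Slicing with [:] makes independent copies, so a mutation stays local.
copies = [best[:], best[:]]
copies[0][0] = 1
print(copies)   # [[1, 0, 0], [0, 0, 0]]
```

A slice copy is enough here because each individual is a flat list of integers; nested structures would need copy.deepcopy instead.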
",python
+"Connect to MariaDB database on Synology NAS from SQLalchemy in python issuefurthering my question here, I am trying to put this question in a simpler way.
+Following this tutorial, I am trying to start a connection to my Mariadb database in my NAS with SQLalchemy remotely. Here is the code:
+# Module Imports
+import mariadb
+import sys
+
+
+user = "my_name"
+passwd = "my_pass"
+host = "192.168.1.111"
+db = "test"
+port= "3307"
+
+# Connect to MariaDB Platform
+try:
+    conn = mariadb.connect(
+        user=user,
+        password=passwd,
+        host=host,
+        port=3307,
+        database=db
+
+    )
+except mariadb.Error as e:
+    print(f"Error connecting to MariaDB Platform: {e}")
+    sys.exit(1)
+
+# Get Cursor
+cur = conn.cursor()
+
+then I get this error:
+ERROR:root:Internal Python error in the inspect module.
+Below is the traceback from this internal error.
+
+Error connecting to MariaDB Platform: Can't connect to server on '192.168.1.111' (36)
+Traceback (most recent call last):
+ File "/var/folders/r5/wq0wq8mx0d56rbrbs38jt94w0000gn/T/ipykernel_39174/3834131737.py", line 14, in <module>
+ conn = mariadb.connect(
+mariadb.OperationalError: Can't connect to server on '192.168.1.111' (36)
+
+During handling of the above exception, another exception occurred:
+
+Traceback (most recent call last):
+ File "/Users/user/miniforge3/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3444, in run_code
+ exec(code_obj, self.user_global_ns, self.user_ns)
+ File "/var/folders/r5/wq0wq8mx0d56rbrbs38jt94w0000gn/T/ipykernel_39174/3834131737.py", line 24, in <module>
+ sys.exit(1)
+SystemExit: 1
+
+During handling of the above exception, another exception occurred:
+
+Traceback (most recent call last):
+ File "/Users/user/miniforge3/lib/python3.9/site-packages/IPython/core/ultratb.py", line 1101, in get_records
+ return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
+ File "/Users/user/miniforge3/lib/python3.9/site-packages/IPython/core/ultratb.py", line 248, in wrapped
+ return f(*args, **kwargs)
+ File "/Users/user/miniforge3/lib/python3.9/site-packages/IPython/core/ultratb.py", line 281, in _fixed_getinnerframes
+ records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
+ File "/Users/user/miniforge3/lib/python3.9/inspect.py", line 1541, in getinnerframes
+ frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
+AttributeError: 'tuple' object has no attribute 'tb_frame'
+
+I have installed mariadb-connector-c and mariadb with brew, and PyMySQL with pip.
+Would anyone please help?
","I agree with @Tim Roberts. On your NAS, go to the installed packages and open the MariaDB 10 app. There, make sure the database is reachable over the intranet or internet: the setting is called TCP/IP connection. Enable it and your connection should work.
+This is a step that many people forget when they first set up MariaDB.
",python
+"How to print a specific line/row from a text file?I have a text file as following:
+1, Max Mustermann, Male, 1000.0
+2, Nora Mustermann, Female, 790.0
+3, Tomas Mustermann, Male, 400.0
+
+I want to read the value in the fourth column: if I type a person's number and then the number 1, the program should show that person's value. Unfortunately, I can't get any further from here. How can I do this?
+data_next = []
+for data in open("my_data.txt"):
+    liste = data.split(",")
+    number = int(liste[0])
+    name = liste[1]
+    sex = liste[2]
+    value = liste[3]
+    data_next.append(number)
+    print(f" [{number}] {name} {sex} {value}")
+
+
+which_person = int(input("WHICH PERSON?: "))
+
+if which_person in data_next:
+    print("Do you want to know the value? Press 1 for YES or 2 for NO")
+    next_input = int(input("YOUR INPUT: "))
+    if next_input == 1:
+        print(value)
+    elif next_input == 2:
+        print("THX")
+
","The for-loop at the beginning of your code places all of the person numbers from the text file into one list:
+data_next = [1, 2, 3]
+
+However, that list isn't enough for what you are trying to do below, which is to take one of the person numbers as input and print the corresponding value; after the loop, value only holds the value from the last line of the file. What you want is a dictionary, which maps one value to another. To achieve this, use the following code:
+# set data_next to an empty dictionary
+data_next = {}
+for data in open("my_data.txt"):
+    liste = data.split(",")
+    number = int(liste[0])
+    name = liste[1]
+    sex = liste[2]
+    value = liste[3]
+    # set the current number equal to the current value in the dictionary
+    data_next[number] = value
+    print(f" [{number}] {name} {sex} {value}")
+
+
+which_person = int(input("WHICH PERSON?: "))
+
+# data_next.keys() will return an iterable of all of the id numbers in the dictionary
+if which_person in data_next.keys():
+    print("Do you want to know the value? Press 1 for YES or 2 for NO")
+    next_input = int(input("YOUR INPUT: "))
+    if next_input == 1:
+        # data_next[which_person] will return the value in the dictionary assigned to the number which_person
+        print(data_next[which_person])
+    elif next_input == 2:
+        print("THX")
+
+Another thing:
+This code will behave slightly differently than you expect because of how the file is formatted. When you split each line by ",":
+"1, Max Mustermann, Male, 1000.0".split(",") == ["1", " Max Mustermann", " Male", " 1000.0"]
+
+As you can see, the spaces after each comma are included with the next value. To fix this, change the split statement to split(", ") instead of split(",") OR remove the spaces after the commas in the file.
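Putting both points together, here is a minimal sketch that splits on ", " and strips the trailing newline. It reads from an in-memory io.StringIO stand-in rather than my_data.txt so that it runs on its own; the names sample_file and values_by_number are assumptions, and converting the value to float is an extra step not present in the code above.

```python
import io

# Stand-in for open("my_data.txt"), using the rows from the question.
sample_file = io.StringIO(
    '1, Max Mustermann, Male, 1000.0\n'
    '2, Nora Mustermann, Female, 790.0\n'
    '3, Tomas Mustermann, Male, 400.0\n'
)

values_by_number = {}
for line in sample_file:
    # strip() drops the newline; split(', ') drops the space after each comma
    number, name, sex, value = line.strip().split(', ')
    values_by_number[int(number)] = float(value)

print(values_by_number[2])  # 790.0
```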
",python
+"How to convert the last number of Str = (2021GC110) in Int on PythonI'm a beginner in Django and I'm trying to build a serial number string with the following structure: 2021 (year) - CG (product name) - 1 (product ID) - 101 (product variant).
+I need the last number to be a variable that increases.
+Example:
+product 1: 2021CG1101
+product 2: 2021CG1102
+This is my view.py:
+    if serialNumberForm.is_valid():
+        os = serialNumberForm.save(commit=False)
+        Produto.numeroSerie = NumeroSerie.id
+        os.numeroSerie = id
+        lastProduct = NumeroSerie.objects.last()
+
+        if lastProduct == None:
+            prefix = datetime.date.today().year
+            fix = product.nome[3:6]
+            sufix = Produto.id
+            var = 10
+            os.serialNumber = str(prefix) + fix + str(sufix) + str(var)
+
+        elif int(lastProduct.serialNumber[0:3]) != datetime.date.today().year:
+            prefix = datetime.date.today().year
+            fix = product.nome[3:6]
+            sufix = Produto.id
+            var = 10
+            os.serialNumber = str(prefix) + fix + str(sufix) + str(var)
+
+        else:
+            prefix = datetime.date.today().year
+            fix = product.nome[3:6]
+            sufix = NumeroSerie.produto(os)
+            var = (lastProduct.serialNumber[-1]) =+ 1
+            os.serialNumber = str(prefix) + fix + str(sufix) + str(var)
+
+        os.save()
+
+
","This looks like a task for regular expressions:
+import re
+reg = re.compile(r"(?P<year>\d{4})(?P<group>[A-Z]{2})(?P<number>\d+)")
+match = reg.match("2021CG1101")
+if match is not None:
+    result = match.groupdict()
+    print(result['year'])
+    print(result['group'])
+    print(result['number'])
+
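+To get from the matched text to an incremented serial number, a minimal sketch could look like the following. It assumes the whole trailing digit run (ID plus variant) is the number to increment; the helper name next_serial is hypothetical:
+
```python
import re

SERIAL_RE = re.compile(r"(?P<year>\d{4})(?P<group>[A-Z]{2})(?P<number>\d+)")


def next_serial(serial):
    """Return the serial with its trailing number increased by one."""
    match = SERIAL_RE.fullmatch(serial)
    if match is None:
        raise ValueError(f"unexpected serial format: {serial!r}")
    # The named group is a string, so convert it before doing arithmetic.
    number = int(match["number"]) + 1
    return f"{match['year']}{match['group']}{number}"


print(next_serial("2021CG1101"))
```
+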
",python
+"I keep getting positional argument errors when trying to use a method from another class in a new class
+I have a ball class defined in Python, as follows, which is initialised with a mass = 1, radius = 1, and then we can set its position and velocity vectors as 2d numpy arrays.
+class Ball():
+
+    def __init__(self, pos = np.array([0,0]), vel = np.array([0,0]), mass = 1, radius = 1):
+        self.mass = mass
+        self.radius = radius
+        self.pos = pos
+        self.vel = vel
+
+I also have a daughter class of Ball, called Container, which is essentially a large ball of radius 10, as follows:
+class Container(Ball):
+    def __init__(self, radius = 10):
+        self.radius = radius
+
+The ball class also has three methods which I would like to use in a new class, called Simulation. These methods are defined in the ball class, as follows (with the parameter other simply being another ball that the self ball collides with):
+    def move(self, dt):
+        self.dt = dt
+        return np.add((self.pos),(np.dot((self.vel),self.dt)))
+
+    def time_to_collision(self, other):
+        self.other = other
+        self.posdif = np.subtract(self.pos, other.pos)
+        self.veldif = np.subtract(self.vel, other.vel)
+        self.posdif = np.subtract(self.pos, other.pos)
+        self.veldif = np.subtract (self.vel, other.vel)
+        self.raddif = self.radius - other.radius
+        return (-2*np.dot(self.posdif, self.veldif) + np.sqrt(4*(np.dot(self.posdif, self.veldif)**2)-4*np.dot(self.veldif, self.veldif)*(np.dot(self.posdif, self.posdif)-np.dot(self.raddif, self.raddif))))/(2*np.dot(self.veldif, self.veldif))
+
+    def collide(self, other):
+        self.other = other
+        return self.vel - (np.dot(self.veldif, self.posdif)*self.posdif)/(np.dot(self.posdif,self.posdif))
+
+Apologies for the long calculation, but I do not think that calculation line is relevant to the problem necessarily, just included it for completeness. These methods, move, time_to_collision and collide are to be used in another class, Simulation. The simulation class is defined as follows:
+class Simulation():
+    def __init__(self, ball = Ball(), container = Container()):
+        self._container = container
+        self._ball = ball
+
+    def next_collision(self):
+        return self._ball.move(self._ball.time_to_collision(self._ball, self._container))
+
+The simulation class aims to be initialised with a Ball object, and a Container object. It then has a method, next_collision, (with the only parameter being self) which uses the methods time_to_collision to work out the time between the collision of the ball and the container, and then it will use move to move the system to that moment in time, and then perform the collision using collide. The situation looks like this if a visualisation might help:
+![]()
+I have tried to achieve this in my next_collision(self): method, but I am always getting the same type error:
+TypeError: time_to_collision() takes 2 positional arguments but 3 were given
+
","Your time_to_collision method takes just two arguments (self and other) and you are passing three, just as the error says.
+The self argument is passed automatically when you call the method on an object. So you should use it as self._ball.time_to_collision(self._container) to achieve what you want.
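+
+A stripped-down sketch of the same situation, using a hypothetical radius_difference method in place of the real collision maths, shows how the extra argument appears:
+
```python
class Ball:
    def __init__(self, radius=1):
        self.radius = radius

    def radius_difference(self, other):
        return self.radius - other.radius


ball = Ball(1)
container = Ball(10)

# Calling through the instance: Python fills in `self` with `ball`.
bound_result = ball.radius_difference(container)

# Calling through the class: `self` has to be passed explicitly.
unbound_result = Ball.radius_difference(ball, container)

# Passing `ball` again on an instance call gives three arguments in total.
try:
    ball.radius_difference(ball, container)
except TypeError as exc:
    error_message = str(exc)

print(bound_result, unbound_result)
print(error_message)
```
+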
",python
+"Z3: Find if variable is multiple of some other number
+I would like to create a constraint that makes sure that a Real value is quantized to some tick value.
+TICK = 0.5
+x = Real('x')
+solve(x % TICK == 0)
+
+Unfortunately, this does not work with Real numbers (it works with Int and FP).
+Another solution that I thought of was to create a set of valid numbers and check whether the number is part of the set, however the set would need to be really big.
+Is there any other solution?
","As Christoph mentioned, computing the modulus of a real-number isn't really meaningful. But your question is still valid: You're asking if x is an integer multiple of TICK. You can do this as follows:
+from z3 import *
+
+TICK = 0.5
+x = Real('x')
+k = Int('k')
+solve(x == 1, ToReal(k) * TICK == x)
+solve(x == 1.2, ToReal(k) * TICK == x)
+
+This prints:
+[k = 2, x = 1]
+no solution
+
+Note that unless x is a constant, this'll lead to a mixed integer-real arithmetic, and it might give rise to non-linear constraints. This can make it hard for the solver to answer your query, i.e., it can return unknown or take too long to respond. It all depends on what other constraints you have on x.
",python
+"Pyside2 trigger messagebox from button click closes whole application
+I want to trigger a message box from a button click without closing the entire application. I managed to do this in my previous project, but this time the behavior is really unexpected: as soon as I click OK or the close button on the message box, the app closes too, without any error. Here's a minimal example:
+class MainWindow(QtWidgets.QMainWindow):
+    def __init__(self):
+        super(MainWindow, self).__init__()
+        self.set_and_load_ui()
+
+    def set_and_load_ui(self):
+        self.ui = QUiLoader().load('main.ui', self)
+
+    def trigger_messagebox(self):
+        self.messagebox()
+
+    def messagebox(self,x=""):
+        msg = QMessageBox(self)
+        msg.setIcon(QMessageBox.Information)
+        msg.setText("{}".format(x))
+        msg.setInformativeText("test")
+        msg.setWindowTitle("test")
+        msg.exec_()
+
+app = QtWidgets.QApplication(sys.argv)
+mainWindow = MainWindow()
+mainWindow.ui.show()
+# mainWindow.trigger_messagebox() #this is fine
+mainWindow.ui.mainmenuButton.clicked.connect(lambda: mainWindow.trigger_messagebox()) #this is not fine
+exit_code = (app.exec_())
+
+Just to check whether the message box itself was causing the problem, I tried calling it directly in my code, and that works fine; it only fails when I trigger it by a button click.
","The problem is related to the fact that QUiLoader loads the UI using the parent given as argument, and since the UI is already a QMainWindow, you're practically loading a QMainWindow into another, which is unsupported: QMainWindows are very special types of QWidgets, and are intended to be used as top level widgets.
+The technical reason for the silent quit is unclear to me, but it's certainly related to the fact that the loaded window actually has a parent (the MainWindow instance), even if that parent is not shown (since you're showing the ui).
+Unfortunately, PySide doesn't provide a way to "install" a UI file to an existing instance as PyQt does (through the uic.loadUi function), so if you want to keep using PySide there are only two options:
+
+- do not use a main window in Designer, but a plain widget ("Widget" in the "New Form" dialog of Designer), load it using QUiLoader and use setCentralWidget() (which is mandatory, since, as the documentation also notes, "creating a main window without a central widget is not supported"):
+    self.ui = QUiLoader().load('main.ui', self)
+    self.setCentralWidget(self.ui)
+
+The downside of this approach is that you cannot use the main window features anymore in Designer (so, no menus, status bar, dock widgets or toolbars).
+
+- use pyside-uic to generate the python code and use the multiple inheritance method to "install" it; the following assumes that you exported the ui file as mainWindow.py:
+    from mainWindow import ui_mainWindow
+
+    class MainWindow(QtWidgets.QMainWindow, ui_mainWindow):
+        def __init__(self):
+            super(MainWindow, self).__init__()
+            self.setupUi(self)
+            # no "set_and_load_ui"
+            self.mainmenuButton.clicked.connect(self.trigger_messagebox)
+
+        # ...
+
+    app = QtWidgets.QApplication(sys.argv)
+    mainWindow = MainWindow()
+    mainWindow.show()
+    exit_code = (app.exec_())
+
+As you can see, now you can access widgets directly (there's no ui object). The downside of this approach is that you must remember to regenerate the python file every time you modify the related ui.
+
+
",python
+"os. commands & open() on AWS S3 object
+I want to implement an AWS Lambda function that will execute the following Python script:
+directory = os.fsencode(directory_in_string)
+
+def transform_csv(csv):
+
+    for file in os.listdir(directory):
+        filename = os.fsdecode(file)
+
+        d = open(r'C:\Users\r.reibold\Documents\GitHub\groovy_dynamodb_api\historische_wetterdaten\{}'.format(filename))
+
+        data = json.load(d)
+
+        df_historical = pd.json_normalize(data)
+
+        #Transform to datetime
+        df_historical["dt"] = pd.to_datetime(df_historical["dt"], unit='s', errors='coerce').dt.strftime("%m/%d/%Y %H:%M:%S")
+
+        df_historical["dt"] = pd.to_datetime(df_historical["dt"])
+
+.
+.
+.
+.
+
+
+My question is now:
+How do I have to change the os. commands, given that I need to reference the S3 bucket and not my local directory?
+My first attempt looks like this:
+DIRECTORY = 's3://weatherdata-templates/historische_wetterdaten/New/'
+BUCKET = 'weatherdata-templates'
+
+s3 = boto3.client('s3')
+paginator = s3.get_paginator('list_objects_v2')
+pages = paginator.paginate(Bucket=BUCKET, Prefix=DIRECTORY)
+
+def lambda_handler(event, context):
+
+    for page in pages:
+        for obj in page['Contents']:
+
+            filename = s3.fsdecode(obj)
+
+            d = open(r's3://102135091842-weatherdata-templates/historische_wetterdaten/New/{}'.format(filename))
+
+            data = json.load(d)
+
+            df_historical = pd.json_normalize(data)
+.
+.
+.
+
+
+Am I on the right track or completely wrong?
+Thx.
","Not quite there yet :)
+Unfortunately, you can't do open(...) directly on an S3 URL as it's not a file object.
+To load the object contents without storing the file locally, try using the S3 Boto3 resource which provides higher-level access to the S3 SDK.
+
+- Get the key of the object from obj['Key'].
+- Use obj.get()['Body'] to get the contents as a StreamingBody.
+- Call .read() on the StreamingBody to get the object in byte format & decode it to a UTF-8 string (or any other encoding that your file(s) is in).
+- Convert JSON string to a JSON object using json.loads(...).
+
+import boto3
+s3_resource = boto3.resource('s3')
+...
+def lambda_handler(event, context):
+    for page in pages:
+        for obj in page['Contents']:
+            obj_reference = s3_resource.Object(BUCKET, obj['Key'])
+            body = obj_reference.get()['Body'].read().decode('utf-8')
+            data = json.loads(body)
+            df_historical = pd.json_normalize(data)
+            ...
+
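+The read, decode and parse steps can be tried without any AWS access. In the sketch below, io.BytesIO is only a stand-in for the S3 StreamingBody (both expose .read()), and the helper name load_json_body and the sample payload are made up for illustration:
+
```python
import io
import json


def load_json_body(body, encoding="utf-8"):
    """Read a file-like body, decode the bytes, and parse the JSON text."""
    raw_bytes = body.read()
    return json.loads(raw_bytes.decode(encoding))


# Stand-in for obj_reference.get()['Body']; not a real S3 response.
fake_body = io.BytesIO(b'{"dt": 1609459200, "temp": 3.5}')
data = load_json_body(fake_body)
print(data["temp"])
```
+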
",python
+"How to make the tkinter Treeview row entries icon appear as default?
+I have a strange Treeview behaviour that I can't resolve.
+I created a widget using a ttk.Frame widget and grid a ttk.Treeview in it as its child. The #0 column shows a directory tree. For each row entry, i.e. tree node, an icon would appear. The option open=True in the .insert() method was used.
+When this script is executed, all the icons appear in the Treeview as a default.
+However, when this same widget is imported into another script and added into a ttk.PanedWindow widget, the icons in the Treeview do not appear immediately. The icons only appear when the Treeview row entry is opened by clicking on it. A second click closes the Treeview row entry and the icons disappear.
+I would like the icons in the ttk.Treeview to appear as a default. How do I do this?
","I found the cause of this phenomenon. When I customised the style of the ttk.Treeview widget in the other script, I discovered the value of the background option was missing a # symbol. The phenomenon of the ttk.Treeview icon not appearing as a default is caused by an incorrect syntax in the value of the background option.
+OS: Linux
+Test code reproducing the phenomenon caused by incorrect syntax.
+#!/usr/bin/python3
+# -*- coding: utf-8 -*-
+
+import tkinter as tk
+import tkinter.ttk as ttk
+
+
+class App(ttk.Frame):
+
+    def __init__(self, parent=None, *args, **kwargs):
+        super().__init__(parent)
+        self.parent = parent
+
+        # Create Treeview
+        self.tree = ttk.Treeview(self, column=('A', 'B'), selectmode='none', height=7)
+        self.tree.grid(row=0, column=0, sticky='nsew')
+
+        # Setup column heading
+        self.tree.heading('#0', text=' Pic directory', anchor='center')
+        self.tree.heading('#1', text=' A', anchor='center')
+        self.tree.heading('#2', text=' B', anchor='center')
+        # #0, #1, #2 denote the 0th, 1st, 2nd columns
+
+        # Setup column
+        self.tree.column('A', anchor='center', width=100)
+        self.tree.column('B', anchor='center', width=100)
+
+        # Insert image to #0
+        # change to your file path
+        self._img = tk.PhotoImage(file="./imagename.png")
+        self.tree.insert('', 'end', text="#0's text", image=self._img,
+                         value=("A's value", "B's value"))
+
+
+if __name__ == '__main__':
+    # Works
+    root = tk.Tk()
+    root.geometry('400x180+300+300')
+    app = App(root)
+    app.grid(row=0, column=0, sticky='nsew')
+    root.rowconfigure(0, weight=1)
+    root.columnconfigure(0, weight=1)
+    root.mainloop()
+
+    # Doesn't work - syntax error in value of background causing missing icon
+    # in Treeview.
+    root = tk.Tk()
+    root.geometry('400x180+300+300')
+    style = ttk.Style()
+    style.configure('Treeview', background='303495') # should be '#303495'
+    app = App(root)
+    app.grid(row=0, column=0, sticky='nsew')
+    root.rowconfigure(0, weight=1)
+    root.columnconfigure(0, weight=1)
+    root.mainloop()
+
",python
+"Efficient deduplication in Python
+I have written a little code that assigns a score to each element of a list. To do this, I need to do the following (simplified code):
+group={1:["Jack", "Jones", "Mike"],
+ 2:["Leo", "Theo", "Jones", "Leo"],
+ 3:["Tom", "Jack"]}
+
+already_chose=["Tom","Mike"]
+result=[]
+
+for group_id in group:
+    name_list = group[group_id]
+    y=0;x=0
+    repeat=[]
+    for name in name_list:
+        if name in already_chose:
+            y+=1
+        elif name not in repeat:
+            x+=1
+            repeat.append(name)
+    score_group=x-y
+    result.append([group_id,score_group])
+
+output: [[1, 1], [2, 3], [3, 0]]
+The issue, as you can see from this code, is that it's not optimized for a large input (more than 7000 groups and 100 names per group)...
+I hope someone can help me. Thanks a lot
","IIUC, you want the number of unique names not in already_chose minus the number of names in already_chose.
+This is easily achieved with Python sets and a list comprehension. The advantage of using sets is that membership operations are very fast thanks to hashing of the elements. Note that, unlike your loop, the set version counts a name from already_chose only once even if it is repeated within a group.
+[[k, len(set(v).difference(already_chose))-len(set(v).intersection(already_chose))]
+ for k,v in group.items()]
+
+output: [[1, 1], [2, 3], [3, 0]]
+NB. might be more useful as dictionary comprehension:
+{k: len(set(v).difference(already_chose))-len(set(v).intersection(already_chose))
+ for k,v in group.items()}
+
+output: {1: 1, 2: 3, 3: 0}
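+
+If repeated names from already_chose should still be counted every time they occur, as in the original loop, a variant along these lines may fit better. It is only a sketch: it builds the lookup set once and the function name score_groups is hypothetical:
+
```python
def score_groups(group, already_chose):
    """Score each group: unique new names minus occurrences of already chosen names."""
    chosen = set(already_chose)  # built once, so each membership test is a hash lookup
    result = []
    for group_id, names in group.items():
        # Counts every occurrence, like `y` in the original loop.
        penalty = sum(1 for name in names if name in chosen)
        # Counts each new name once, like `x` in the original loop.
        fresh = len(set(names) - chosen)
        result.append([group_id, fresh - penalty])
    return result


group = {1: ["Jack", "Jones", "Mike"],
         2: ["Leo", "Theo", "Jones", "Leo"],
         3: ["Tom", "Jack"]}
already_chose = ["Tom", "Mike"]
print(score_groups(group, already_chose))
```
+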
",python
+"Initializing python object using different object types as arguments
+I've read how to overload... and multiple constructors in python and a few more on this topic, however I'm looking for something more specific.
+I'm given a list of Content objects and a list of Data objects:
+class Content:
+    def __init__(self):
+        self._title = 'movie title'
+        self._url = 'http://movie-url'
+
+    @property
+    def title(self):
+        return self._title
+
+    @property
+    def url(self):
+        return self._url
+
+
+class Data:
+    def __init__(self):
+        self._title = 'movie title'
+        self._year = 2021
+        self._rating = 7.6
+
+    @property
+    def title(self):
+        return self._title
+
+    @property
+    def year(self):
+        return self._year
+
+    @property
+    def rating(self):
+        return self._rating
+
+I want to match each Content with its corresponding Data by the title property and combine everything under one class Movie, by passing one of the other objects to Movie's init argument:
+movie_content = Movie(Content())
+movie_data = Movie(Data())
+
+From what I've read so far my options are:
+
+- Default arguments: Doesn't seem to fit here (correct me if I'm wrong) since I want to pass only one argument anyway.
+- *args: I prefer to avoid passing a long line of arguments (there will be at least 12).
+- @classmethod: This approach is the most appealing to me, but I'm struggling with the implementation:
+
+class Movie:
+    def __init__(self, object):
+        self.object = object
+
+    @classmethod
+    def by_content(cls, content):
+        _title = content.title
+        _url = content.url
+
+        return cls( # what goes here?)
+
+    @classmethod
+    def by_data(cls, data):
+        _title = data.title
+        _year = data.year
+        _rating = data.rating
+
+        return cls( # what goes here?)
+
+
+- Using methods as multi-setters, which I'm currently using (not the most pythonic from my understanding):
+
+class Movie:
+    def __init__(self):
+        self._title = ''
+        self._url = ''
+        self._year = 0
+        self._rating = 0.0
+
+    def by_content(self, content):
+        self._title = content.title
+        self._url = content.url
+
+    def by_data(self, data):
+        self._title = data.title
+        self._year = data.year
+        self._rating = data.rating
+
+Any thoughts or suggestions would be greatly appreciated.
","You can use classmethods as secondary constructors, where you use the values from the object passed in to fill out the attributes in your Movie class's constructor.
+I have written the example using dataclasses since that makes the code shorter while providing the same functionality (or more).
+from dataclasses import dataclass
+
+@dataclass
+class Content:
+    title: str
+    url: str
+
+@dataclass
+class Data:
+    title: str
+    year: int
+    rating: float
+
+@dataclass
+class Movie:
+    title: str
+    year: int
+    rating: float
+    url: str
+
+    @classmethod
+    def by_content(cls, content: Content): return cls(
+        title = content.title,
+        url = content.url,
+        rating = 0,
+        year = 0,
+    )
+
+    @classmethod
+    def by_data(cls, data: Data): return cls(
+        title = data.title,
+        url = 'https://null.com',
+        rating = data.rating,
+        year = data.year,
+    )
+
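+Since the stated goal is to match each Content with its Data by title, one possible way to combine the two lists is sketched below. The helper merge_by_title and the choice to skip unmatched titles are assumptions, not something stated in the question:
+
```python
from dataclasses import dataclass


@dataclass
class Content:
    title: str
    url: str


@dataclass
class Data:
    title: str
    year: int
    rating: float


@dataclass
class Movie:
    title: str
    year: int
    rating: float
    url: str


def merge_by_title(contents, data_items):
    """Pair each Content with the Data that has the same title."""
    data_by_title = {item.title: item for item in data_items}
    movies = []
    for content in contents:
        data = data_by_title.get(content.title)
        if data is None:
            continue  # assumption: a Content without matching Data is skipped
        movies.append(Movie(title=content.title, year=data.year,
                            rating=data.rating, url=content.url))
    return movies


movies = merge_by_title(
    [Content("movie title", "http://movie-url")],
    [Data("movie title", 2021, 7.6)],
)
print(movies)
```
+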
",python
+"ModuleNotFoundError: No module named 'flask_mail'
+I am creating a web application where the user will receive a confirmation mail after they've registered.
+But I am getting an error in the command line:
+Traceback (most recent call last):
+  File "/usr/local/lib/python3.9/site-packages/flask/cli.py", line 240, in locate_app
+    __import__(module_name)
+  File "/home/ubuntu/flask/registration/application.py", line 3, in <module>
+    from flask_mail import Mail, Message
+ModuleNotFoundError: No module named 'flask_mail'
+and an error on the web page:
+Internal Server Error
+The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
+application.py
+import os
+from flask import Flask, redirect, render_template, request
+from flask_mail import Mail, Message
+from cs50 import SQL
+
+app = Flask(__name__)
+app.config["MAIL_DEFAULT_SENDER"] = os.getenv("MAIL_DEFAULT_SENDER")
+app.config["MAIL_PASSWORD"] = os.getenv("MAIL_PASSWORD")
+app.config["MAIL_PORT"] = 587
+app.config["MAIL_SERVER"] = smtp.gmail.com
+app.config["MAIL_USE_TLS"] = True
+app.config["MAIL_USERNAME"] = os.getenv("MAIL_USERNAME")
+mail = Mail(app)
+
+
+db = SQL("sqlite:///froshims.db")
+REGISTRANTS = {}
+
+SPORT=["Cricket", "Football", "Badminton", "Kho-Kho", "Kabaddi"]
+
+@app.route("/", methods = ["GET", "POST"])
+def index():
+    if request.method == "GET":
+        return render_template("index.html", sports = SPORT)
+    email = request.form.get("email")
+    sport = request.form.get("sport")
+    if not email:
+        return render_template("failure.html", message="E-mail not entered")
+    if not sport:
+        return render_template("failure.html", message="Sport not selected")
+    if sport not in SPORT:
+        return render_template("failure.html", message="Sport not in list. Don't try to hack our website.")
+    if request.method == "POST":
+        REGISTRANTS[name]=sport
+        print("yes")
+        db.execute("INSERT INTO registrants (name, sport) VALUES (?,?)", name, sport)
+        message = Message("You are registered!", recipients=[email])
+        mail.send(message)
+        return redirect("success")
+
+@app.route("/success")
+def success():
+    print("kaam kar na")
+    registrants = db.execute("SELECT * FROM registrants")
+    return render_template("success.html", registrants = registrants)
+
+I am unable to figure out the issue. I am new to Flask. Please guide me.
","I had the same problem with the flask_mail import. Installing Flask-Mail with pip in a virtual environment from inside a code editor alone might not fix the module import error.
+Go to your command line, change into the app directory, and install the package there.
+Also, make sure the pip version in your venv environment is up to date.
+pip install Flask-Mail
+
+This worked for me.
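+
+If the error persists, it may help to check which interpreter is actually running and whether it can see the package. This is only a diagnostic sketch using the standard library; the helper name module_available is hypothetical:
+
```python
import importlib.util
import sys


def module_available(module_name):
    """Return True if the running interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The path shows which Python (system or venv) is executing this code.
print("Interpreter:", sys.executable)
print("flask_mail importable:", module_available("flask_mail"))
```
+
+If the interpreter shown is not the one from your virtual environment, running python -m pip install Flask-Mail with that same interpreter installs the package where the app will look for it.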
",python
+"Vectorization of Product Over
+I am seeking a vectorized form of the following computation:
+import numpy as np
+D = 100
+N = 1000
+K = 10
+
+X = np.random.uniform(0, 1, (K, N))
+T = np.random.uniform(0, 1000, (D, N))
+out = np.zeros((D, K))
+
+for i in range(D):
+    for j in range(K):
+        out[i, j] = np.prod(X[j, :] ** T[i, :])
+
+
+There are einsum-style things I've tried, but the presence of the np.prod is throwing me off a bit.
+EDIT: Reduced size of matrices.
","I'm trying to make the broadcasting as explicit as possible - the None introduces an additional dummy dimension of size 1:
+out = np.prod(X[None, :, :] ** T[:, None, :], axis=2)
+
+It is easy to see how it works if we recall the shapes: X.shape = (K, N), T.shape = (D, N) and out.shape = (D, K). With the dummy dimension we basically take something of (1, K, N) to the power of (D, 1, N) which results in (D, K, N). Finally if we reduce via product over the last dimension we get our desired output of (D, K).
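+
+As a side note, and only as a sketch under the assumption that every entry of X is strictly positive, the same product can be rewritten in log space. That turns it into a single matrix product and avoids the (D, K, N) intermediate array:
+
```latex
\mathrm{out}_{ij}
  = \prod_{n=1}^{N} X_{jn}^{\,T_{in}}
  = \exp\!\Big(\sum_{n=1}^{N} T_{in}\,\log X_{jn}\Big)
\quad\Longrightarrow\quad
\mathrm{out} = \exp\!\big(T\,(\log X)^{\top}\big)
```
+
+Here T has shape (D, N) and (log X)^T has shape (N, K), so the result is (D, K) as required. In NumPy terms this would correspond to np.exp(T @ np.log(X).T). With exponents as large as 1000 the products may underflow to 0, so keeping T @ np.log(X).T (the logarithm of the result) may be the more useful quantity.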
",python
+"matching query does not exist. Django Error
+This is the code for my project. I keep getting this error; I tried to figure it out but I don't understand it.
+Django Error:
+
+DoesNotExist at /save_post/
+Profile matching query does not exist.
+views.py, line 75, in save_post
+form.authore = Profile.objects.get(user=request.user)
+
+views.py
+@login_required
+def save_post(request):
+    if request.method == "POST":
+        form = Post(content=request.POST['content'])
+        form.authore = Profile.objects.get(user=request.user)
+        form.save()
+    elif request.method == "PUT":
+        data = json.loads(request.body)
+        post_id = int(data["post_id"])
+        new_content = data["new_content"]
+        post = Post.objects.filter(id=post_id).first()
+        if post.authore.user != request.user:
+            return HttpResponse(status=401)
+        post.content = new_content
+        post.save()
+        return JsonResponse({
+            "result": True
+        }, status=200)
+    else:
+        return JsonResponse({
+            "error": "post not found"
+        }, status=400)
+    return index(request)
+
+models.py
+class User(AbstractUser):
+    pass
+
+
+class Profile(models.Model):
+    user = models.ForeignKey(User, on_delete=models.CASCADE)
+
+
+class Post(models.Model):
+    content = models.TextField()
+    timestamp = models.DateTimeField(default=timezone.now)
+    authore = models.ForeignKey(Profile, on_delete=models.CASCADE)
+    likes = models.PositiveIntegerField(default=0, blank=True, null=True)
+
+    def serialize(self):
+        return {
+            "id": self.id,
+            "content": self.content,
+            "timestamp": self.timestamp.strftime("%b %#d %Y, %#I:%M %p"),
+            "authore": self.authore.id,
+            "username": self.authore.user.username,
+            "likes": self.likes.count(),
+        }
+
","The code itself seems correct, but the user you are logged in as doesn't have a profile.
+So here are my notes:
+You can use get_object_or_404, which results in a 404 Not Found response if the profile is not found:
+from django.shortcuts import get_object_or_404
+form.authore = get_object_or_404(Profile, user=request.user)
+
+There is no logic that creates a Profile when a user is created, so it is logical that the user doesn't have a profile.
+It is better to use a one-to-one relationship for the user profile, as each user has only one profile and vice versa.
+To make this work correctly, refer to
+https://simpleisbetterthancomplex.com/tutorial/2016/07/22/how-to-extend-django-user-model.html
+
+Go to the section "Extending User Model Using a One-To-One Link" in the link above.
+
+Follow this tutorial and the user profile will be created automatically when a user is created.
",python
+"How can I say to Python to do an instruction at a given time?
+I want one of my if conditions to activate at a specific time of the day (for example 10:00:00).
+For example:
+if time is 10:00:00:
+print("Hello world")
+Important: I already read this: Python script to do something at the same time every day
+But I don't want to use a function!
","You could easily use datetime to help you with that.
+import datetime
+from time import sleep
+
+timing = [10, 0, 0] # Hour, minute, second, in 24 hour time
+
+while True: # Repeat forever
+    now = datetime.datetime.now()
+    data = [now.hour, now.minute, now.second]
+    if data == timing:
+        # Code to be executed
+        print("Hello World")
+        #######
+        sleep(1) # To ensure the command is not repeated again
+        # break # Uncomment this if you want to execute the command only once
+
+Make sure that I indented it properly, because one space can tick python off :).
+The way that it works:
+import datetime and from time import sleep import the necessary modules and functions that you will need.
+Modules needed:
+datetime
+time.sleep
+Now we're set.
+timing = [10,0,0] sets the time that you want to use (you'll see why later)
+while True repeats the loop... on and on and on.
+now = datetime.datetime.now() creates a shortcut for such a long piece of text.
+data == timing makes sure the time matches the timing you asked.
+
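+Note that datetime.datetime.now() reads the system clock in the machine's local time zone, so the timing is in UTC only if that machine is set to UTC (as many servers and online interpreters are).
+Go to Getting the correct timezone offset in Python using local timezone to know how to find your offset.
+
+An offset of UTC-0200 (or -7200 seconds) means that you need to ADD 2 hours to your time to get UTC. Or, if your time zone is UTC+0200, SUBTRACT 2 hours from your time.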
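+
+As an alternative to checking the clock on every pass of the loop, one could compute how long to wait and sleep once. This is a different technique from the loop above, shown only as a sketch; the helper name seconds_until is hypothetical and it works on naive local datetimes:
+
```python
import datetime


def seconds_until(target, now):
    """Seconds from `now` until the next occurrence of the time-of-day `target`."""
    candidate = datetime.datetime.combine(now.date(), target)
    if candidate <= now:
        # Today's occurrence has already passed, so use tomorrow's.
        candidate += datetime.timedelta(days=1)
    return (candidate - now).total_seconds()


example_now = datetime.datetime(2021, 1, 1, 9, 59, 30)
wait = seconds_until(datetime.time(10, 0, 0), example_now)
print(wait)
```
+
+With this, time.sleep(seconds_until(datetime.time(10, 0, 0), datetime.datetime.now())) would block until the next 10:00:00 local time, after which the code to be executed can run.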
",python
+"Using dict values to count iterations over a list to insert
+import random
+from itertools import repeat
+
+races_per_season = {
+ '2015' : "19",
+ '2016' : "21",
+ '20116' : "21",
+ '2017' : "20",
+ '2018' : "21",
+ '2019' : "21",
+ '2020' : "17",
+ '2021' : "16"
+}
+
+tmp_list = list(repeat(random.sample(range(80),10), 156))
+total_races = 0
+for k,v in races_per_season.items():
+    while total_races < int(v):
+        tmp_list[total_races].insert(1, k)
+        total_races += 1
+        break # inserting breaks here and below, somewhat works, but only gives me the first year throughout the list
+    break
+
+
+for x in tmp_list:
+    print(x)
+
+I am trying to use the dict values to iterate over a list of list and insert the key into the list at index 1. However, no matter how I try, it seems to iterate and insert all keys into the list then moves on to the next...
+This is the result I am seeing.... however by adding the breaks above, this continues throughout the list of 156 lists.. and doesn't change at list 19
+[[29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73]]
+
+but my desired result is the following.
+[[29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73],
+ [29, '2015', 56, 39, 31, 25, 37, 5, 16, 8, 73]...
+
+and continuing with '2015' 19 times, then inserting '2016' into the following 21 etc.. When I just print out the k,v pairs it works as I want it to, but I can't somehow convert that into a list. All values in the dict sum to the value of the len(tmp_list) 156
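+ANY help would be more than appreciated. Thanks
","The problem is that repeat(...) gives you 156 references to the same inner list, so every insert shows up in all of them; you also need a running index that is separate from the per-season counter. Check this: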
+import random
+from itertools import repeat
+
+races_per_season = {
+ '2015' : "19",
+ '2016' : "21",
+ '20116' : "21",
+ '2017' : "20",
+ '2018' : "21",
+ '2019' : "21",
+ '2020' : "17",
+ '2021' : "16"
+}
+# this line to create 156 lists that are not sharing the same reference
+tmp_list = [list(arr) for arr in repeat(random.sample(range(80),10), 156)]
+i = 0
+for k,v in races_per_season.items():
+    total_races = 0
+    while total_races < int(v):
+        tmp_list[i].insert(0, k)
+        tmp_list[i].insert(0, int(v))
+        total_races += 1
+        i+=1
+
+for x in tmp_list:
+    print(x)
+
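+The shared-reference effect mentioned in the comment above can be seen in isolation with a small sketch (the tiny [0, 0] rows are made up for illustration):
+
```python
from itertools import repeat

# repeat() yields the same list object each time, so all rows are one list.
shared = list(repeat([0, 0], 3))
shared[0].insert(1, "x")

# Copying each row gives three independent lists.
independent = [list(row) for row in repeat([0, 0], 3)]
independent[0].insert(1, "x")

print(shared)
print(independent)
```
+
+In the first case the insert appears in every row; in the second only the first row changes.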
",python
+"staticmethod decorator seems pointless
+I was reading about the @staticmethod in Python when I came across the following code:
+class MyClass:
+    my_var = 0
+
+    @staticmethod
+    def static_method():
+        MyClass.my_var += 1
+
+I just don't understand exactly why you can write a code like this... Doesn't it defeat the purpose of this method to be static?
+I get it that there's also the fact that the first parameter won't be a class/instance reference, but... Still weird to call this decorator like that if I still can access class variables, no?
+And if I can access class variables, why everywhere I read about it says that I cannot, even though I just clearly did with the code above? Is it just because I'm doing it wrong?
","The idea that a static method can't modify class state is based on the idea that the static method doesn't receive a reference to the class as an argument like a class method does. However, in this case, a reference to the class is provided as a hard-coded value.
+One reason for defining a static method rather than a class method is to guarantee that you modify the attribute of a specific class, rather than a possible subclass.
+class A:
+    my_var = 0
+
+    @classmethod
+    def foo(cls):
+        cls.my_var += 1
+
+    @staticmethod
+    def bar():
+        A.my_var += 1
+
+
+class B(A):
+    my_var = 0
+
+A call to B.foo will modify B.my_var, not A.my_var. A call to B.bar will modify A.my_var.
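+
+Running the two calls one after the other makes the difference visible. The snapshot variables after_foo and after_bar are only there for illustration:
+
```python
class A:
    my_var = 0

    @classmethod
    def foo(cls):
        cls.my_var += 1

    @staticmethod
    def bar():
        A.my_var += 1


class B(A):
    my_var = 0


B.foo()  # cls is B, so only B.my_var changes
after_foo = (A.my_var, B.my_var)

B.bar()  # the class is hard-coded, so A.my_var changes
after_bar = (A.my_var, B.my_var)

print(after_foo, after_bar)
```
+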
",python
+"How to resize the dataset label in albumentations to work with the tensorflow image_dataset_from_directory function?
+I am running the following code:
+[https://pastebin.com/LK8tKZtN]
+The error obtained is following:
+
+File "C:\Users\Admin\PycharmProjects\BugsClassfications\main2.py",
+line 45, in set_shapes *
+label.set_shape([])
+ValueError: Shapes must be equal rank, but are 1 and 0
+
+
+How do I correct the set_shapes function so that it works with image_dataset_from_directory?
+Here is my code:
+import tensorflow as tf
+import numpy as np
+import matplotlib.pyplot as plt
+from functools import partial
+from albumentations import (Compose, HorizontalFlip,Rotate)
+
+AUTOTUNE = tf.data.experimental.AUTOTUNE
+
+def process_image(image, label, img_size):
+    # cast and normalize image
+    image = tf.image.convert_image_dtype(image, tf.float32)
+    # apply simple augmentations
+    image = tf.image.random_flip_left_right(image)
+    image = tf.image.resize(image,[img_size, img_size])
+    return image, label
+
+transforms = Compose([
+    Rotate(limit=40),
+    HorizontalFlip()
+])
+
+
+def aug_fn(image, img_size):
+    data = {"image":image}
+    aug_data = transforms(**data)
+    aug_img = aug_data["image"]
+    aug_img = tf.cast(aug_img/255.0, tf.float32)
+    aug_img = tf.image.resize(aug_img, size=[img_size, img_size])
+    return aug_img
+
+
+def process_data(image, label, img_size):
+    aug_img = tf.numpy_function(func=aug_fn, inp=[image, img_size], Tout=tf.float32)
+    return aug_img, label
+
+
+def set_shapes(img, label, img_shape=(128,128,3)):
+    img.set_shape(img_shape)
+    label.set_shape([])
+    return img, label
+
+
+def view_image(ds):
+    image, label = next(iter(ds)) # extract 1 batch from the dataset
+    image = image.numpy()
+    label = label.numpy()
+
+    fig = plt.figure(figsize=(22, 22))
+    for i in range(20):
+        ax = fig.add_subplot(4, 5, i + 1, xticks=[], yticks=[])
+        ax.imshow(image[i].astype(dtype=np.uint8))
+        ax.set_title(f"Label: {label[i]}")
+    plt.show()
+
+
+train_dir = './dataset/train'
+img_size = 128
+data = tf.keras.utils.image_dataset_from_directory(train_dir, image_size=(img_size, img_size))
+print(data)
+
+#augmentation
+ds_alb = data.map(partial(process_data, img_size = 128), num_parallel_calls=AUTOTUNE).prefetch(AUTOTUNE)
+#resize
+ds_alb = ds_alb.map(set_shapes, num_parallel_calls=AUTOTUNE).batch(32)
+
+print(ds_alb)
+
","If you change the shape of your labels, it should work:
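+def set_shapes(img, label, img_shape=(128,128,3)):
+    img.set_shape(img_shape)
+    label.set_shape([1,])
+    return img, label
+
+But you should ask yourself why you are explicitly setting the shape of your data at all. Note that image_dataset_from_directory already yields batches, which is why the label tensor has rank 1 rather than 0. Check this post.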
",python
+"Can python cursor.execute accept multiple queries in one go?Can the cursor.execute call below execute multiple SQL queries in one go?
+cursor.execute("use testdb;CREATE USER MyLogin")
+
+I don't have Python set up yet, but I want to know whether the above form is supported by cursor.execute.
+import pyodbc
+# Some other example server values are
+# server = 'localhost\sqlexpress' # for a named instance
+# server = 'myserver,port' # to specify an alternate port
+server = 'tcp:myserver.database.windows.net'
+database = 'mydb'
+username = 'myusername'
+password = 'mypassword'
+cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
+cursor = cnxn.cursor()
+#Sample select query
+cursor.execute("SELECT @@version;")
+row = cursor.fetchone()
+while row:
+ print(row[0])
+ row = cursor.fetchone()
+
","Multiple SQL statements in a single string are often referred to as an "anonymous code block".
+There is nothing in pyodbc (or pypyodbc) to prevent you from passing a string containing an anonymous code block to the Cursor.execute() method. They simply pass the string to the ODBC Driver Manager (DM) which in turn passes it to the ODBC Driver.
+However, not all ODBC drivers accept anonymous code blocks by default. Some databases default to allowing only a single SQL statement per .execute() to protect us from SQL injection issues.
+For example, MySQL/Connector ODBC defaults MULTI_STATEMENTS to 0 (off) so if you want to run an anonymous code block you will have to include MULTI_STATEMENTS=1 in your connection string.
+Note also that changing the current database by including a USE … statement in an anonymous code block can sometimes cause problems because the database context changes in the middle of a transaction. It is often better to execute a USE … statement by itself and then continue executing other SQL statements.
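The advice to run USE on its own and then continue with separate statements can be sketched as follows. This is a minimal illustration rather than pyodbc-specific code: sqlite3 from the standard library stands in for the real driver (an assumption made so the example runs without a server), and run_statements is a hypothetical helper name. sqlite3 happens to be one of the drivers that refuses several statements in one execute() call.

```python
import sqlite3


def run_statements(cursor, statements):
    """Execute each SQL statement with its own cursor.execute() call."""
    for sql in statements:
        cursor.execute(sql)


# sqlite3 is only a stand-in for the real server in this sketch
conn = sqlite3.connect(':memory:')
cur = conn.cursor()

try:
    # two statements in one string: rejected by this particular driver
    cur.execute('CREATE TABLE t (x INTEGER); INSERT INTO t VALUES (1)')
    multi_ok = True
except (sqlite3.Warning, sqlite3.ProgrammingError):
    multi_ok = False

# the portable route: one statement per call
run_statements(cur, ['CREATE TABLE t (x INTEGER)', 'INSERT INTO t VALUES (1)'])
rows = cur.execute('SELECT x FROM t').fetchall()
conn.close()
```

With pyodbc the same loop would receive the cursor returned by cnxn.cursor(), so a USE statement could be the first entry in the list and run by itself.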
",python
+"Matplotlib pie chart wedges using color gradientI am trying to create a pie chart with each wedge having a different color gradient (e.g., yellow-green) instead of a single color (e.g., green). To further explain, the gradient should be set along the radius and not the circumference of the pie.
+Tried many options and did some research online but couldn't find a direct solution to this.
+Is there a library or approach I should take to achieve this?
+Thanks in advance.
","You can create an image with the desired gradient, and position and clip it via each wedge. LinearSegmentedColormap.from_list() interpolates between given colors.
+Here is an example:
+import matplotlib.pyplot as plt
+from matplotlib.colors import LinearSegmentedColormap
+import numpy as np
+
+fig, ax = plt.subplots()
+
+sizes = np.random.uniform(10, 20, 4)
+color_combos = [('yellow', 'green'), ('red', 'navy'), ('yellow', 'crimson'), ('lime', 'red')]
+wedges, texts = ax.pie(sizes, labels=['alpha', 'beta', 'gamma', 'delta'])
+xlim = ax.get_xlim()
+ylim = ax.get_ylim()
+for wedge, color_combo in zip(wedges, color_combos):
+    wedge.set_facecolor('none')
+    wedge.set_edgecolor('black')
+    print(wedge.theta1, wedge.theta2)
+    bbox = wedge.get_path().get_extents()
+    x0, x1, y0, y1 = bbox.xmin, bbox.xmax, bbox.ymin, bbox.ymax
+    x = np.linspace(x0, x1, 256)[np.newaxis, :]
+    y = np.linspace(y0, y1, 256)[:, np.newaxis]
+    # fill = np.sqrt(x ** 2 + y ** 2) # for a gradient along the radius, needs vmin=0, vmax=1
+    fill = np.degrees(np.pi - np.arctan2(y, -x))
+    gradient = ax.imshow(fill, extent=[x0, x1, y0, y1], aspect='auto', origin='lower',
+                         cmap=LinearSegmentedColormap.from_list('', color_combo),
+                         vmin=wedge.theta1, vmax=wedge.theta2)
+    gradient.set_clip_path(wedge)
+ax.set_xlim(xlim)
+ax.set_ylim(ylim)
+ax.set_aspect('equal')
+plt.show()
+
+[Figure: on the left, an example of a gradient along the angle; on the right, a gradient along the radius.]
",python
+"Problem with Python module ""Schedule"" for timed fan controlI wrote the code below to control a fan.
+import schedule
+import time
+import RPi.GPIO as GPIO
+
+GPIO.setmode(GPIO.BCM)
+GPIO.setup(14, GPIO.OUT) #set pin 14 as output
+
+def aan():
+    GPIO.output(14,0) #set pin 14 to "low", fan comes on
+    print(time.strftime("%H:%M")+" aan") #print time in hours and minutes, aan=on in Dutch
+
+def uit():
+    GPIO.output(14,1) #set pin 14 to "high", fan goes off
+    print(time.strftime("%H:%M")+" uit") #print time in hours and minutes, uit=off in Dutch
+
+schedule.every().hour.at(":00").do(aan)
+schedule.every().hour.at(":01").do(uit)
+schedule.every().hour.at(":15").do(aan)
+schedule.every().hour.at(":16").do(uit)
+schedule.every().hour.at(":30").do(aan)
+schedule.every().hour.at(":31").do(uit)
+schedule.every().hour.at(":45").do(aan)
+schedule.every().hour.at(":46").do(uit)
+
+try:
+    while True:
+        if int(time.strftime("%H")) in range(9,21): #only perform the schedule between 9 and 21 hours
+            schedule.run_pending()
+            time.sleep(1)
+        else:
+            time.sleep(1)
+
+finally:
+    print("clean up")
+    GPIO.cleanup() # cleanup all GPIO pins
+
+It is supposed to run for 1 minute at the 4 specified times in the schedule. The code works, but I've noticed from the print in the terminal that it also comes on 4 times at 09:00 (see below).
+09:00 aan
+09:00 uit
+09:00 aan
+09:00 uit
+09:00 aan
+09:00 uit
+09:00 aan
+09:00 uit
+
+I've tried changing the :00 "on" and :01 "off" schedules, but this doesn't seem to make a difference. It would be greatly appreciated if someone could help me!
","I would address the problem a little differently as illustrated below:
+def fan():
+    # Operate the fan
+    on_time = 60 # Runtime in seconds
+    if int(time.strftime("%H")) in range(9,21): #only perform the schedule between 9 and 21 hours
+        GPIO.output(14,0) #set pin 14 to "low", fan comes on
+        print(f'Turn on Fan at {time.strftime("%H:%M")}')
+        time.sleep(on_time)
+        GPIO.output(14,1) #set pin 14 to "high", fan goes off
+        print(f'Turn Off Fan at {time.strftime("%H:%M")}')
+
+Then:
+for minute in (":00", ":15", ":30", ":45"): # the four start times from the question
+    schedule.every().hour.at(minute).do(fan)
+
+try:
+    while True:
+        schedule.run_pending() # always poll, so no jobs pile up outside the 9-21 window
+        time.sleep(1)
+finally:
+    print("clean up")
+    GPIO.cleanup() # cleanup all GPIO pins
+
+I can't fully test this code because I don't have your fan device, but this seems to work with just print statements.
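A plausible explanation for the burst at 09:00 in the question, assuming the scheduler runs every overdue job the next time it is polled: between 21:00 and 09:00 run_pending() is never called, so all eight jobs are overdue when polling resumes and they all fire at once. The toy model below does not use the schedule library; OFFSETS, next_due and simulate are names made up for this sketch, and the dates are arbitrary.

```python
from datetime import datetime, timedelta

OFFSETS = [0, 1, 15, 16, 30, 31, 45, 46]  # minutes past the hour, as in the question


def next_due(now, minute):
    """First time strictly after `now` whose minute equals `minute`."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


def simulate(start, end, gate_inside_job):
    """Toy scheduler: return the times at which a job body actually ran."""
    due = {m: next_due(start, m) for m in OFFSETS}
    ran = []
    now = start
    while now <= end:
        in_window = 9 <= now.hour < 21
        # gate_inside_job=True: always poll, the job itself checks the hour
        # gate_inside_job=False: only poll inside the window (the question's loop)
        if gate_inside_job or in_window:
            for m in OFFSETS:
                if due[m] <= now:  # overdue jobs run as soon as we poll
                    if in_window:
                        ran.append(now)
                    due[m] = next_due(now, m)
        now += timedelta(minutes=1)
    return ran


start = datetime(2024, 1, 1, 20, 50)
end = datetime(2024, 1, 2, 9, 0)
burst = simulate(start, end, gate_inside_job=False)
single = simulate(start, end, gate_inside_job=True)
print(len(burst), len(single))  # prints: 8 1
```

Under this assumption, moving the hour check into the job and polling unconditionally leaves only the :00 job firing at 09:00.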
",python
+"How to store the result from loop into the variable or the listI have the code below that runs on the active excel sheet to check specified cells.
+If the specified cell shows "Fail", it will print out the failed person and the time.
+import xlwings as xw
+import xlrd
+
+def check_result():
+    sheet = xw.books.active.sheets.active
+    for x in range(1, 5):
+        if sheet['B' + str(x)].value == "Fail":
+            print(sheet['A' + str(x)].value, xlrd.xldate_as_datetime(sheet['C' + str(x)].value, 0))
+
+check_result()
+
+Sample Data
+How can I save this printed result into the variable or the list?
+The Excel file (.xlsm) is connected to third-party software, and it needs to be open to generate the data.
","From the print statement, it looks like two values are printed for each failed row. You can collect them in a list (or a list of tuples) like this.
+With a list
+results = []
+results.append([sheet['A' + str(x)].value, xlrd.xldate_as_datetime(sheet['C' + str(x)].value, 0)])
+
+NOTE: define results outside of your for loop, and call results.append(...) inside the loop in place of print(...).
+Let me know if there is any problem with the code.
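The collect-in-a-loop pattern can be sketched without Excel. The rows below are made-up stand-ins for columns A, B and C of the sheet, and collect_failures is a hypothetical name; in the real function the values would come from the xlwings sheet as in the question.

```python
from datetime import datetime


def collect_failures(rows):
    """Return (name, timestamp) pairs for every row whose status is 'Fail'."""
    failures = []  # created once, before the loop
    for name, status, timestamp in rows:
        if status == 'Fail':
            failures.append((name, timestamp))
    return failures


# made-up rows standing in for columns A, B and C of the sheet
rows = [
    ('Alice', 'Pass', datetime(2022, 1, 3, 9, 0)),
    ('Bob', 'Fail', datetime(2022, 1, 3, 9, 5)),
    ('Carol', 'Fail', datetime(2022, 1, 3, 9, 10)),
]
failed = collect_failures(rows)
print(failed)
```

Returning the list from the function, rather than printing inside it, lets the caller decide what to do with the failed entries.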
",python
+"Find all partitions of n of length less-than-or-equal to LHow might I find all the partitions of n that have length less-than-or-equal-to L?
","Based on the code given here, we can include an additional argument L (which defaults to n).
+We might naively include if len((i,) + p) <= L: before yield (i,) + p. However, since len((i,) + p) = 1 + len(p), any partitions of n-i that are longer than L-1 are discarded, so time is wasted finding them. Instead, we should pass L-1 as the length limit when finding partitions of n-i. We then need to deal with the L=0 case properly, by not running the main body:
+def partitions(n, L=None, I=1):
+    if L is None:
+        L = n
+
+    if L:
+        yield (n,)
+        for i in range(I, n//2 + 1):
+            for p in partitions(n-i, L-1, i):
+                yield (i,) + p
+
+Now if L=1, the for i loop will be executed, but none of the for p loops will since the partitions calls won't yield anything; we need not execute the for i loop at all in this case, which can save a lot of time:
+def partitions(n, L=None, I=1):
+    if L is None:
+        L = n
+
+    if L == 1:
+        yield (n,)
+    elif L > 1:
+        yield (n,)
+        for i in range(I, n//2 + 1):
+            for p in partitions(n-i, L-1, i):
+                yield (i,) + p
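A quick way to check the final version is to run it on a small case and verify that every result sums to n and respects the length limit. The function is repeated here only so that the snippet is self-contained; the values 5, 2, 10 and 3 are arbitrary choices for illustration.

```python
def partitions(n, L=None, I=1):
    """Yield partitions of n as non-decreasing tuples with at most L parts, each part >= I."""
    if L is None:
        L = n
    if L == 1:
        yield (n,)
    elif L > 1:
        yield (n,)
        for i in range(I, n // 2 + 1):
            for p in partitions(n - i, L - 1, i):
                yield (i,) + p


short = list(partitions(5, 2))
print(short)  # [(5,), (1, 4), (2, 3)]

# every partition of 10 with at most 3 parts sums to 10 and has length <= 3
limited = list(partitions(10, 3))
print(all(sum(p) == 10 and len(p) <= 3 for p in limited))
```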
+
",python
+"How to split sentence in a list based on two words?I have a list of string like this lst = ['John Kim and Kerry Lin', 'John Cena', 'Kim Rai with Kaster Baldwin'], and I would like to split the words in list if they have and or with as separators such that the final outcome is ['John Kim', 'Kerry Lin', 'John Cena', 'Kim Rai', 'Kaster Baldwin']. How do I achieve this? My try was:
+to_ret = []
+for words in lst:
+    splitted = words.split(' and')
+    to_ret.extend(splitted)
+new_ret = []
+for words in to_ret:
+    splitted = words.split(' with')
+    new_ret.extend(splitted)
+
+but this looks very repetitive. Any suggestions for cleaner code?
","You could use a regular expression to handle both delimiters, and chain to flatten the sublists into one list. Matching the surrounding whitespace as part of the delimiter keeps the pattern from splitting inside names that merely contain the letters and or with (such as Sandy).
+import re
+from itertools import chain
+lst = ['John Kim and Kerry Lin', 'John Cena', 'Kim Rai with Kaster Baldwin']
+
+output = [w.strip() for w in chain.from_iterable(re.split(r'\s+(?:and|with)\s+', x) for x in lst)]
+print(output)
+
+Output
+['John Kim', 'Kerry Lin', 'John Cena', 'Kim Rai', 'Kaster Baldwin']
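If the separator words may change, the pattern can be built from a list. This is a small sketch of that idea; split_on_words and its separators parameter are names invented here, and the extra name Sandy Brandon is added only to show that whole-word matching leaves it intact.

```python
import re
from itertools import chain


def split_on_words(items, separators=('and', 'with')):
    """Split each string on whole separator words surrounded by whitespace."""
    alternation = '|'.join(re.escape(word) for word in separators)
    pattern = re.compile(r'\s+(?:' + alternation + r')\s+')
    return list(chain.from_iterable(pattern.split(item) for item in items))


names = split_on_words(['John Kim and Kerry Lin', 'Sandy Brandon', 'Kim Rai with Kaster Baldwin'])
print(names)
```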
+
",python
+"Creating a cube that is normal to an eigenspace in Matplotlib.pyplotI am trying to make a cube whose sides are each normal to one of the eigenvectors, as a way to visualize principal stresses given any possible normal and shear stresses in 3D. I've tried using simple rotation matrices and applying them to a list of points, but there always seems to be some error, and I'm not sure whether it is in how I apply the rotation matrices, the angles I give them, or the order I use.
+import numpy as np
+import math
+import matplotlib.pyplot as plt
+from math import cos, sin
+
+ax = plt.axes(projection="3d")
+
+# normal and shear stresses
+Fx = float(input("Sigma X: "))
+Fy = float(input("Sigma Y: "))
+Fz = float(input("Sigma Z: "))
+Vxy = float(input("Tau xy: "))
+Vxz = float(input("Tau xz: "))
+Vyz = float(input("Tau yz: "))
+
+A = [[Fx, Vxy, Vxz],
+ [Vxy, Fy, Vyz],
+ [Vxz, Vyz, Fz]]
+
+eigval, eigvect = np.linalg.eig(A) # finding principal stresses and their directions
+
+# rounding off error
+eigvect = np.round(eigvect, 5)
+eigval = np.round (eigval, 5)
+
+# drawing eigenvectors or principal force directions
+ax.quiver(0, 0, 0, eigvect[0, 0] * 2, eigvect[1, 0] * 2, eigvect[2, 0] * 2, color="orange")
+ax.quiver(0, 0, 0, eigvect[0, 1] * 2, eigvect[1, 1] * 2, eigvect[2, 1] * 2, color="blue")
+ax.quiver(0, 0, 0, eigvect[0, 2] * 2, eigvect[1, 2] * 2, eigvect[2, 2] * 2, color="red")
+
+# drawing original normal force directions
+ax.quiver(0, 0, 0, 2 * Fx / np.abs(Fx), 0, 0, color="orange", linestyle="dashed")
+ax.quiver(0, 0, 0, 0, 2 * Fy / np.abs(Fy), 0, color="blue", linestyle="dashed")
+ax.quiver(0, 0, 0, 0, 0, 2 * Fz / np.abs(Fz), color="red", linestyle="dashed")
+
+# points used to draw the cube
+points = np.array([[1.0, 1.0, 1.0],
+ [1.0, -1.0, 1.0],
+ [-1.0, -1.0, 1.0],
+ [-1.0, 1.0, 1.0],
+ [1.0, 1.0, 1.0],
+ [1.0, 1.0, -1.0],
+ [1.0, -1.0, -1.0],
+ [1.0, -1.0, 1.0],
+ [1.0, -1.0, -1.0],
+ [-1.0, -1.0, -1.0],
+ [-1.0, -1.0, 1.0],
+ [-1.0, -1.0, -1.0],
+ [-1.0, 1.0, -1.0],
+ [-1.0, 1.0, 1.0],
+ [-1.0, 1.0, -1.0],
+ [1.0, 1.0, -1.0]])
+
+# finding the angles that I need to rotate the cube
+x = np.arctan2(eigvect[1, 2], eigvect[2, 2])
+y = np.arctan2(eigvect[2, 0], eigvect[0, 0])
+z = np.arctan2(eigvect[0, 1], eigvect[1, 1])
+
+# rotation matrices
+xrot = np.matrix([[1, 0, 0], [0, cos(y), -sin(y)], [0, sin(y), cos(y)]])
+yrot = np.matrix([[cos(x), 0, sin(x)], [0, 1, 0], [-sin(x), 0, cos(x)]])
+zrot = np.matrix([[cos(z), -sin(z), 0], [sin(z), cos(z), 0], [0, 0, 1]])
+
+# updating the points list to rotate the cube
+for i in range(len(points)):
+    points[i] = np.dot(points[i], xrot)
+    points[i] = np.dot(points[i], yrot)
+    points[i] = np.dot(points[i], zrot)
+
+# plotting the rotated cube
+ax.plot(points[:, 0], points[:, 1], points[:, 2], color="blue")
+
+ax.set_xlabel("X")
+ax.set_ylabel("Y")
+ax.set_zlabel("Z")
+
+plt.show()
+
+The most likely spot for the error, I think, is either the angles I'm using (the variables x, y, and z, in radians) or the way I apply the rotation matrices. However, I could be going about this completely wrong, and maybe I should be using some other method of rotating the points.
+My main goal is just to align the cube with the eigenvectors, as the cube is only a visual representation. If you have any ideas or know of a different way I could do this, it would be greatly appreciated!
","I figured it out: basically, the angles were wrong and the rotation matrices were not general enough.
+What I first needed to do was align the cube with one of the axes. I did this by rotating it only about the z axis to one of my vectors, then rotating it about a new vector perpendicular to the first in the xy plane, and finally about the vector itself. Here's the code I used to fix this (I'll also add a link to the Wikipedia article with the rotation matrices, as that was where I figured it out): https://en.wikipedia.org/wiki/Rotation_matrix
+# angles used
+theta = - np.arctan2(eigvect[1, 0], eigvect[0, 0])
+Vxy = np.arccos(eigvect[2, 0])
+Uxyz = - np.arctan2(eigvect[1, 2], eigvect[2, 2])
+
+# z-axis rotation
+z_rot = np.matrix([[cos(theta), -sin(theta), 0],
+ [sin(theta), cos(theta), 0],
+ [0, 0, 1]])
+
+# making the perpendicular unit vector
+magnitude = np.sqrt(eigvect[1, 0] ** 2 + eigvect[0, 0] ** 2)
+vx = eigvect[1, 0] / magnitude
+vy = - eigvect[0, 0] / magnitude
+vz = 0
+
+# xy-plane rotation
+Vxy_rot = np.matrix([[cos(Vxy) + (vx ** 2) * (1 - cos(Vxy)), vx * vy * (1 - cos(Vxy)) - vz * sin(Vxy), vx * vz * (1 - cos(Vxy)) + vy * sin(Vxy)],
+ [vy * vx * (1 - cos(Vxy)) + vz * sin(Vxy), cos(Vxy) + (vy ** 2) * (1 - cos(Vxy)), vy * vz * (1 - cos(Vxy)) - vx * sin(Vxy)],
+ [vz * vx * (1 - cos(Vxy)) - vy * sin(Vxy), vz * vy * (1 - cos(Vxy)) + vx * sin(Vxy), cos(Vxy) + (vz ** 2) * (1 - cos(Vxy))]])
+
+# x, y, z components
+ux = eigvect[0, 0]
+uy = eigvect[1, 0]
+uz = eigvect[2, 0]
+
+# xyz rotation
+Uxyz_rot = np.matrix([[cos(Uxyz) + (ux ** 2) * (1 - cos(Uxyz)), ux * uy * (1 - cos(Uxyz)) - uz * sin(Uxyz), ux * uz * (1 - cos(Uxyz)) + uy * sin(Uxyz)],
+ [uy * ux * (1 - cos(Uxyz)) + uz * sin(Uxyz), cos(Uxyz) + (uy ** 2) * (1 - cos(Uxyz)), uy * uz * (1 - cos(Uxyz)) - ux * sin(Uxyz)],
+ [uz * ux * (1 - cos(Uxyz)) - uy * sin(Uxyz), uz * uy * (1 - cos(Uxyz)) + ux * sin(Uxyz), cos(Uxyz) + (uz ** 2) * (1 - cos(Uxyz))]])
+
+# applying the rotations
+for i in range(len(points)):
+    points[i] = np.dot(points[i], z_rot)
+    points[i] = np.dot(points[i], Vxy_rot)
+    points[i] = np.dot(points[i], Uxyz_rot)
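As an alternative to composing three rotations (this is not the method used above, just a sketch of another route): for a symmetric stress matrix the eigenvectors are orthonormal, so the eigenvector matrix itself can serve as the transform that carries the x, y, z axes onto the principal directions. The stress values below are hypothetical, and align_points_with_eigenvectors is a name made up for this example. The matrix may be a reflection rather than a pure rotation, which makes no visible difference for a cube centred on the origin.

```python
import numpy as np


def align_points_with_eigenvectors(A, points):
    """Map row-vector points so that the x, y, z axes land on the eigenvectors of A."""
    # eigh is intended for symmetric matrices; its eigenvectors (columns) are orthonormal
    eigval, eigvect = np.linalg.eigh(A)
    # a row p is mapped to eigvect @ p, written for all rows at once as points @ eigvect.T
    return eigval, eigvect, points @ eigvect.T


# hypothetical stress values, chosen only for illustration
A = np.array([[10.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
axes = np.eye(3)  # unit vectors along x, y, z
eigval, eigvect, rotated_axes = align_points_with_eigenvectors(A, axes)
print(rotated_axes)
```

Passing the points array from the question instead of axes would give cube vertices whose edges run along the eigenvectors.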
+
",python
+"Creating relationship between models when in separate files FLASK, SQLalchemyHow can I properly establish a foreign key relationship between these two models when they are currently in separate folders?
+from sqlalchemy import Column, String, Integer, DateTime, ForeignKey
+from testserver.database import Base
+from datetime import datetime
+
+
+class Service(Base):
+    __tablename__ = 'services'
+
+    id = Column(Integer(), primary_key=True)
+    name = Column(String(25), nullable=False, unique=True)
+    description = Column(String(80), nullable=False)
+    date_created = Column(DateTime(), default=datetime.utcnow)
+    provider_id = Column(String, ForeignKey('provider.id'))
+
+from sqlalchemy import Column, String, Integer, DateTime
+from sqlalchemy.orm import relationship
+from testserver.database import Base
+from datetime import datetime
+from testserver.database import db_session
+
+
+class Provider(Base):
+    __tablename__ = 'providers'
+
+    id = Column(Integer(), primary_key=True)
+    name = Column(String(25), nullable=False, unique=True)
+    description = Column(String(80), nullable=False)
+    date_created = Column(DateTime(), default=datetime.utcnow)
+    services = relationship(
+        'Service', cascade='all, delete-orphan', backref='provided_by')
+
+
+from sqlalchemy.ext.declarative import declared_attr
+
+
+class Base(object):
+    __abstract__ = True
+
+    @declared_attr
+    def __tablename__(cls):
+        return cls.__name__.lower()
+
+
+The code above currently throws an error:
+sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column 'services.provider_id' could not find table 'provider' with which to generate a foreign key to target column 'id'
+
+Any recommendations or solutions would be greatly appreciated.
","I assume that you have a typo here:
+provider_id = Column(String, ForeignKey('provider.id'))
+
+In the ForeignKey you provided the lowercased model name instead of the table name. If you replace provider with providers (because that is the name of your table), it should fix your problem. It is also worth declaring the column as Integer so that its type matches providers.id:
+provider_id = Column(Integer, ForeignKey('providers.id'))
+
",python
+"Error converting datetime string to datetime objectI get an error when I try to convert a datetime string to a datetime object:
+df['R_DATE'] = pd.to_datetime(df['R_DATE'], format='%a %b %d %H:%M:%S %Z %Y')
+
+Error is:
+...
+File "pandas\_libs\tslibs\strptime.pyx", line 141, in pandas._libs.tslibs.strptime.array_strptime
+ ValueError: time data 'Mon Oct 18 00:00:00 EDT 2021'
+ does not match format '%a %b %d %H:%M:%S %Z %Y' (match)
+
+From what I can tell, the format appears to match the datetime string value. I'm not sure if the timezone value (EDT) is causing issues.
","Never mind, I found the answer I was looking for: dateutil can parse the string if the timezone abbreviations are mapped explicitly through tzinfos.
+import dateutil.parser
+import dateutil.tz
+
+tzdict = {'EST': dateutil.tz.gettz('America/New_York'),
+          'EDT': dateutil.tz.gettz('America/New_York')}
+
+df['R_DATE'] = df['R_DATE'].apply(dateutil.parser.parse, tzinfos=tzdict)
+
",python
+"What is the most efficient way to intersect a list of strings with a numpy array of matches?I am using Aho-Corasick to perform some string searches on documents.
+The original code uses a numpy array to efficiently store the matches for each string in a list of strings:
+import ahocorasick
+import numpy as np
+
+def ahocorasickFind(search_list, input):
+    A = ahocorasick.Automaton()
+    for idx, s in enumerate(search_list):
+        A.add_word(s, (idx, s))
+    A.make_automaton()
+
+    index_list = []
+    for item in A.iter(input):
+        print(item)
+        index_list.append(item[1][0])
+
+    output_list = np.array([0] * len(search_list))
+    output_list[index_list] = 1
+    return output_list.tolist()
+
+search_list = ['joão','maria','thiago'] # thousands of words in the real code
+result = ahocorasickFind(search_list,'asdasdasd joão 1231231 thiago') # huge text in the real code
+for index, element in enumerate(result):
+    if(element == 1):
+        print(search_list[index])
+
+Using the above approach took too much time and memory to iterate and test (if == 1).
+So, how can I get the original strings found in the input text in a performant way?
","If you are only interested in matching for words (i.e. separated by a white space), rather than using a full search text, it might be faster to use a set of words. Note, however, that this uses some additional memory. One straightforward solution to replicate your behaviour would be:
+words = set(text.split())
+for w in search_list:
+    if w in words:
+        print(w)
+
+or even shorter (but changing the order of the result, and deleting duplicates from the search list):
+for w in set(search_list).intersection(text.split()):
+    print(w)
+
+I've quickly tested it on relatively large text object (143M characters, 23M words) and a rather short search_list object (606 words, of which 295 unique ones), and the times I got are:
+
+- corasick: 14.5s
+- first version above: 4.6s
+- second version above: 2.6s (this speedup is just due to doing half the work only by skipping duplicates)
+
+However, the Aho-Corasick version uses a (relatively) negligible amount of additional memory, while the two set-based versions use quite a lot of it (for the data I was using, it could be almost 2GB of additional memory).
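The first set-based version can be wrapped in a function that returns the matched strings instead of printing them. This is only a sketch under the same whole-word assumption as above; find_words is a hypothetical name, and the sample data is taken from the question.

```python
def find_words(search_list, text):
    """Return the entries of search_list that occur as whitespace-separated words in text."""
    words = set(text.split())  # one pass over the text, O(1) membership tests afterwards
    return [w for w in search_list if w in words]


search_list = ['joão', 'maria', 'thiago']
found = find_words(search_list, 'asdasdasd joão 1231231 thiago')
print(found)  # ['joão', 'thiago']
```

Unlike a substring search, this would not find joão inside a longer token such as joãozinho, so it only fits when whole-word matches are what is wanted.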
",python
+"Keep sub-sequences of a binary list if they surpass a given lengthI want to create a function that takes as input a list (or numpy array) A and a number L. A is full of 0s and 1s, and the goal is to keep the sub-sequences of 1s only if they are at least L long. I wrote a function, fix(A,L), to do it, but it takes too long to run, so I wanted to know if there is a faster way of doing this.
+def fix(A,L):
+    i=0
+    while True:
+        if i==len(A):
+            return(A)
+        if A[i]==1:
+            s=0
+            for j in range(i,len(A)):
+                if A[j]==1:
+                    s+=1
+                    continue
+                else:
+                    if s>=L:
+                        break
+                    else:
+                        A[i:j]=[0]*len(A[i:j])
+                        break
+            if A[j]==1 and s<L:
+                A[i:j+1]=[0]*len(A[i:j+1])
+            i=j+1
+        else:
+            i+=1
+            continue
+
+If I call fix([1,0,0,1,1,1,0,1,1,1,1,0,1,1,0,1], 3) it returns [0,0,0,1,1,1,0,1,1,1,1,0,0,0,0,0], which is the correct answer.
","If you're working with 2D numpy arrays, what you want to achieve can be done using binary erosion followed by binary dilation (a morphological opening). We can use scipy.ndimage.binary_erosion and binary_dilation.
+Here the 1 x L structuring element makes the operation act along a single dimension (each row) only:
+np.random.seed(0)
+A = np.random.randint(0, 2, (10, 20))
+
+from scipy.ndimage import binary_dilation, binary_erosion
+
+L = 3
+mask = np.ones((1, L))
+binary_dilation(binary_erosion(A, mask), mask).astype(int)
+
+example input:
+array([[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
+ [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
+ [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
+ [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
+ [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0],
+ [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
+ [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
+ [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
+ [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
+ [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]])
+
+output:
+array([[0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
+ [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
+ [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
+ [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
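For the 1-D list in the question, a run-length approach in plain NumPy is a possible alternative to the erosion/dilation route. The sketch below is not the method shown above; keep_long_runs is a name made up here, and it returns a new array rather than modifying the input in place.

```python
import numpy as np


def keep_long_runs(a, min_len):
    """Zero out every run of 1s in a 1-D 0/1 sequence that is shorter than min_len."""
    a = np.asarray(a).astype(int)
    # pad with zeros so that runs touching either end still have a start and an end
    edges = np.diff(np.concatenate(([0], a, [0])))
    starts = np.flatnonzero(edges == 1)   # index where each run begins
    ends = np.flatnonzero(edges == -1)    # index one past where each run ends
    out = np.zeros_like(a)
    for s, e in zip(starts, ends):
        if e - s >= min_len:
            out[s:e] = 1
    return out


result = keep_long_runs([1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1], 3)
print(result.tolist())  # [0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

The Python loop runs once per run of 1s rather than once per element, and the edge detection itself is vectorized.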
+
+Visual input/output:
+
➔ ![]()
",python
+"Python bs4 .find not detecting articleI'm trying to get the names of products, but when it gets to the sponsored products it returns None. Here's my code:
+ next_page_url = 'https://www.jumia.com.ng/catalog/?q=oraimo&shipped_from=country_local&page=1#catalog-listing'
+ result_nextpage = requests.get(next_page_url, headers=headers).text # headers are generated from default python 'fake_headers' module.
+ doc_nextpage = BeautifulSoup(result_nextpage, 'lxml') # I also tried other parsers
+ divs = doc_nextpage.find('div', class_='-paxs row _no-g _4cl-3cm-shs')
+ result_articles = divs.select('h3.name')
+ for i in result_articles:
+     print(i.string)
+
+Result:
+Oraimo FreePods-3 2Baba Edition BT 5.2 Wireless Stereo Earbuds
+Oraimo 27000mAh Massive Power Charing Bank Traveller 3 Byte
+Oraimo OPB-P116DN 10000 Mah Power-Bank Dual Fast Charging
+Oraimo FreePods3 True Wireless Stereo Earbuds IPX5 & Sweat Proof
+Oraimo Smart Watch 1.69'' IPS Screen IP68 Waterproof
+Oraimo FreePods-2 2Baba-version True Wireless Earbuds
+Oraimo Silver Edition Smart Watch 1.69'' IPS Screen IP68 Waterproof
+Oraimo Charger UKDualUSB OCW-U63D White
+Oraimo Portable Wireless Speaker Subwoofer Outdoor Sound Box
+Oraimo Charger Oraimo UKDualUSB OCW-U81F White
+Oraimo Power Oraimo Bank OPB-P206DN 20KmAh
+Oraimo SoundPro Wireless Speaker Muti-Model Music Play
+Oraimo Tempo-W3 Smart Watch Health Monitor IP67 Waterproof
+Oraimo Car Charger Oraimo OCC-21DML Black
+Oraimo SoundPro-2C 10W Portable Wireless Bluetooth Speaker
+Oraimo Necklace 5C Neckband Wireless Earphone
+Oraimo COMPACT 10000mAh Ultra Slim Fast Charging Power Bank
+Oraimo 10000mAh OPTIMIZED SLIM Power-bank With LED Light
+Oraimo Mermaid Half In-ear Earphone With Mic
+Oraimo Necklace 3 Lite Neckband BT 5.0 Wireless Earphone
+Oraimo Senior BT5.0 Single Wireless Bluetooth Headsets
+Oraimo True Wireless Bluetooth Earbuds- Freepods 2
+Oraimo FreePods-2 2Baba-version True Wireless Earbuds
+Oraimo Air-Buds-2S Super Bass Wireless Stereo Earbuds
+Oraimo 20000MAH Powerbank -long Lasting PowerBank
+Oraimo Bluetooth Wireless SOUNDBAR SPEAKER
+Oraimo Shark-2 BT5.0 In-Ear Wireless Bluetooth Headphones
+Oraimo BoomPop Over-Ear Bluetooth Wireless Headphone
+Oraimo 20000MAH Powerbank -long Lasting Power For Days
+Oraimo FreePods-2 2Baba-Version True Wireless Stereo Earbud
+Oraimo 2021 Latest Edition Smart Function Waterproof Smart Watch
+Oraimo OCW-U36S Efficient And Durable USB Charger - Black
+Oraimo FreePods-2 2Baba-Version True Wireless Stereo Earbud-white
+Oraimo 10000MAh Ultimate Slim Power Bank - Black
+Oraimo 20000MAH Powerbank - Power For Days
+Oraimo 10000mAh Ultra Slim Fast Charging Power Bank
+Oraimo 2020 Edition Tempo S - OSW-11 Multi Function Smart Watch
+Oraimo SOLID 27000mAh Massive Powerbank OPB-P271D Traveller 3 Byte
+Oraimo FreePods-3 2Baba Edition BT 5.2 Wireless Stereo Earbuds
+Oraimo Tempo-S IP67 Waterproof Smart Watch WITH AMAZING FUNCTIONS
+None
+None
+None
+None
+None
+None
+None
+None
+
+The article tags 41-48 are sponsored products. Their names show up when I inspect the element in the browser, but bs4 doesn't detect them, although it detects the non-sponsored ones.
+Please kindly help.
","Note: first of all, take a look at your soup (doc_nextpage); that is the actual data you are processing.
+What happens?
+In your doc_nextpage the HTML for the sponsored products is empty, and that's why you get these None values.
+They are empty because the website provides them dynamically, and requests cannot handle this: it is not a browser, so it does not interpret or manipulate the page data.
+How to fix?
+One option is to simulate browser behavior with selenium and pass its page_source to BeautifulSoup, or to extract the data with selenium itself.
+from bs4 import BeautifulSoup
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.chrome.service import Service as ChromeService
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+
+options = webdriver.ChromeOptions()
+service = ChromeService(executable_path='ENTER YOUR PATH TO CHROMEDRIVER')
+driver = webdriver.Chrome(service=service, options=options)
+driver.get('https://www.jumia.com.ng/catalog/?q=oraimo&shipped_from=country_local&page=1#catalog-listing')
+
+WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '[data-list="sponsored"]')))
+
+soup = BeautifulSoup(driver.page_source, 'lxml')
+
+print([x.text for x in soup.select('article h3.name')])
+
+driver.close()
+
+Output
+['Oraimo FreePods-3 2Baba Edition BT 5.2 Wireless Stereo Earbuds',
+ 'Oraimo 27000mAh Massive Power Charing Bank Traveller 3 Byte',
+ 'Oraimo OPB-P116DN 10000 Mah Power-Bank Dual Fast Charging',
+ 'Oraimo FreePods3 True Wireless Stereo Earbuds IPX5 & Sweat Proof',
+ "Oraimo Smart Watch 1.69'' IPS Screen IP68 Waterproof",
+ 'Oraimo FreePods-2 2Baba-version True Wireless Earbuds',
+ "Oraimo Silver Edition Smart Watch 1.69'' IPS Screen IP68 Waterproof",
+ 'Oraimo Charger UKDualUSB OCW-U63D White',
+ 'Oraimo Portable Wireless Speaker Subwoofer Outdoor Sound Box',
+ 'Oraimo Charger Oraimo UKDualUSB OCW-U81F White',
+ 'Oraimo Portable Source 10000mAh Po Wer Ba Nk Oraimo OPB-P110D',
+ 'Oraimo Power Oraimo Bank OPB-P206DN 20KmAh',
+ 'Oraimo SoundPro Wireless Speaker Muti-Model Music Play',
+ 'Oraimo Tempo-W3 Smart Watch Health Monitor IP67 Waterproof',
+ 'Oraimo Car Charger Oraimo OCC-21DML Black',
+ 'Oraimo SoundPro-2C 10W Portable Wireless Bluetooth Speaker',
+ 'Oraimo Necklace 5C Neckband Wireless Earphone',
+ 'Oraimo 10000mAh OPTIMIZED SLIM Power-bank With LED Light',
+ 'Oraimo COMPACT 10000mAh Ultra Slim Power Fast Charging Bank',
+ 'Oraimo Mermaid Half In-ear Earphone With Mic',
+ 'Oraimo Necklace 3 Lite Neckband BT 5.0 Wireless Earphone',
+ 'Oraimo Senior BT5.0 Single Wireless Bluetooth Headsets',
+ 'Oraimo True Wireless Bluetooth Earbuds- Freepods 2',
+ 'Oraimo Pilot 20000mAh 2.1A Fast Power Charging Bank',
+ 'Oraimo FreePods-2 2Baba-version True Wireless Earbuds',
+ 'Oraimo Air-Buds-2S Super Bass Wireless Stereo Earbuds',
+ 'Oraimo 20000MAH Powerbank -long Lasting PowerBank',
+ 'Oraimo Bluetooth Wireless SOUNDBAR SPEAKER',
+ 'Oraimo Shark-2 BT5.0 In-Ear Wireless Bluetooth Headphones',
+ 'Oraimo BoomPop Over-Ear Bluetooth Wireless Headphone',
+ 'Oraimo 20000MAH Powerbank -long Lasting Power For Days',
+ 'Oraimo FreePods-2 2Baba-Version True Wireless Stereo Earbud',
+ 'Oraimo OCW-U36S Efficient And Durable USB Charger - Black',
+ 'Oraimo 2021 Latest Edition Smart Function Waterproof Smart Watch',
+ 'Oraimo OCW-U36S Efficient And Durable USB Charger - Black',
+ 'Oraimo Firefly-2 5.0V/2.1A Dual USB Fast Wall Charger',
+ 'Oraimo FreePods-2 2Baba-Version True Wireless Stereo Earbud-white',
+ 'Oraimo 10000MAh Ultimate Slim Power Bank - Black',
+ 'Oraimo SOLID 27000mAh Massive Powerbank OPB-P271D Traveller 3 Byte',
+ 'Oraimo FreePods-3 2Baba Edition BT 5.2 Wireless Stereo Earbuds',
+ 'Oraimo Massive 27000mAh Travellers 3 Byte OPB-P271D Power Bank',
+ 'Oraimo 1.69" IPS Screen IP68 Waterproof Smart Watch Pro-Silver',
+ 'Oraimo Tempo-S IP67 Waterproof Smart Watch WITH AMAZING FUNCTIONS',
+ 'Oraimo FreePods-3 E104D 2Baba Edition BT 5.2 Wireless Earbuds',
+ 'Oraimo Tempo-S IP67 Waterproof Smart Watch',
+ 'Oraimo 2020 Edition Tempo S - OSW-11 Multi Function Smart Watch',
+ 'Oraimo 10000mAh Ultra Slim Fast Charging Power Bank',
+ 'Oraimo 20000MAH Powerbank - Power For Days']
+
",python
+"FacetGrid returned by seaborn's relplot does not respect hueI am encountering a problem in which seaborn's relplot function creates a FacetGrid that behaves differently from one created manually. I find this unintuitive and would like the relplot function to give me a FacetGrid that behaves like the manually created one.
+The issue is that when using the map function on the FacetGrid returned from relplot, it does not consider a specified hue anymore. Here is a minimal example that explains my point:
+import numpy as np
+import seaborn as sns
+import pandas as pd
+
+def foo(color=None, label=None, tag=None):
+    print(tag, color, label)
+
+x = np.random.randn(100)
+df = pd.DataFrame({
+ 'x' : x,
+ 'y' : 2 * x,
+ 'row' : np.random.randn(x.shape[0]) > 0,
+ 'col' : np.random.randn(x.shape[0]) > 0,
+ 'hue' : np.random.randn(x.shape[0]) > 0,
+})
+g = sns.relplot(data = df, x = 'x', y = 'y', row='row', col='col', hue='hue')
+g.map(foo, tag='relplot')
+
+g2 = sns.FacetGrid(data = df, row = 'row', col = 'col', hue='hue')
+g2.map(foo, tag='FacetGrid')
+
+When calling the map function on the facet grid returned by relplot, it will only be called four times (once per row and column) but will not respect the fact that I also specified a hue. The output is:
+relplot (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) None
+relplot (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) None
+relplot (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) None
+relplot (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) None
+
+If I map the same function to the FacetGrid that is manually created, it will result in the expected behaviour:
+FacetGrid (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) False
+FacetGrid (0.8666666666666667, 0.5176470588235295, 0.3215686274509804) True
+FacetGrid (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) False
+FacetGrid (0.8666666666666667, 0.5176470588235295, 0.3215686274509804) True
+FacetGrid (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) False
+FacetGrid (0.8666666666666667, 0.5176470588235295, 0.3215686274509804) True
+FacetGrid (0.2980392156862745, 0.4470588235294118, 0.6901960784313725) False
+FacetGrid (0.8666666666666667, 0.5176470588235295, 0.3215686274509804) True
+
+Is there any explanation as to why that happens? Is there a way to change relplot's behaviour to match the expected one, i.e. to respect the hue parameter I set?
","
+Is there any explanation as to why that happens?
+
+Yes, in relplot the hue logic is all handled within scatterplot, not by the FacetGrid.
+
+Is there a way to change relplot's behaviour to match the expected one?
+
+No, in this case you'll want to make your custom function handle the hue logic internally or start with FacetGrid directly.
",python
+"Getting back property of object in a query list SQLAlchemyI have a join table called UserServices. Which has a FK of service_id and makes a back ref to a service. Below I get all the userServices in which the id in the route param matches the user_id (another FK)
+I am then trying to access all the service properties on the all_user_services list.
+My current code only returns one dict instead of a list of dicts. What am I doing wrong?
+@bp.route('/user/<id>/services', methods=['GET'])
+def get_services_from_user(id):
+ all_user_services = db_session.query(UserService).filter(UserService.user_id == id).all()
+
+ for service in all_user_services:
+ result = service_schema.dump(service.service)
+ return jsonify(result)
+
","You return during the first iteration of the for loop, so only one dict is ever sent back. Build a list of results instead:
+dumped = [service_schema.dump(s.service) for s in all_user_services]
+return jsonify(dumped)
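The difference between returning inside the loop and collecting every result can be seen without Flask or SQLAlchemy. The sketch below is only an illustration: `dump` is a hypothetical stand-in for `service_schema.dump`, and the service names are made up.

```python
def dump(service):
    # Hypothetical stand-in for service_schema.dump(service.service).
    return {'name': service}


def first_only(services):
    for s in services:
        result = dump(s)
        return result  # leaves the function during the first iteration


def all_of_them(services):
    # Collect one dumped dict per service, then return the whole list.
    return [dump(s) for s in services]


services = ['email', 'storage', 'billing']  # made-up sample data
print(first_only(services))   # a single dict
print(all_of_them(services))  # a list with one dict per service
```

The first function mirrors the code in the question, the second mirrors the list comprehension above.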
+
",python
+"big data in pytorch, help for tuning stepsI've previously splitted my bigdata:
+# X_train.shape : 4M samples x 2K features
+# X_test.shape : 2M samples x 2K features
+
+I've prepared the dataloaders
+target = torch.tensor(y_train.to_numpy())
+features = torch.tensor(X_train.values)
+train = data_utils.TensorDataset(features, target)
+train_loader = data_utils.DataLoader(train, batch_size=10000, shuffle=True)
+
+testtarget = torch.tensor(y_test.to_numpy())
+testfeatures = torch.tensor(X_test.values)
+test = data_utils.TensorDataset(testfeatures, testtarget)
+validation_generator = data_utils.DataLoader(test, batch_size=20000, shuffle=True)
+
+I copied from an online course this example for a network (no idea if other model are better)
+base_elastic_model = ElasticNet()
+param_grid = {'alpha':[0.1,1,5,10,50,100],
+ 'l1_ratio':[.1, .5, .7, .9, .95, .99, 1]}
+grid_model = GridSearchCV(estimator=base_elastic_model,
+ param_grid=param_grid,
+ scoring='neg_mean_squared_error',
+ cv=5,
+ verbose=0)
+
+I've built this fitting
+for epoch in range(1):
+ # Training
+ cont=0
+ total = 0
+ correct = 0
+ for local_batch, local_labels in train_loader:
+ cont+=1
+ with torch.set_grad_enabled(True):
+ grid_model.fit(local_batch,local_labels)
+ with torch.set_grad_enabled(False):
+ predicted = grid_model.predict(local_batch)
+ total += len(local_labels)
+ correct += ((1*(predicted>.5)) == np.array(local_labels)).sum()
+ #print stats
+
+ # Validation
+ total = 0
+ correct = 0
+
+ with torch.set_grad_enabled(False):
+ for local_batch, local_labels in validation_generator:
+ predicted = grid_model.predict(local_batch)
+ total += len(local_labels)
+ correct += ((1*(predicted>.5)) == np.array(local_labels)).sum()
+ #print stats
+
+Maybe my grandchildren will have the results for 1 epoch!
+I need some advice:
+
+- how/where (in the code) can I quickly use less data for a first tuning?
+- any advice on the steps needed to get a result in 2022?
+- because I've added "with torch.set_grad_enabled(False):" for stats printing, do I have to add (as done) "with torch.set_grad_enabled(True):"?
+- I have a GPU (useful without images?). I have the function "get_device()". Where do I have to put ".to(get_device())" to use CUDA?
+- I'm learning by putting pieces of information together; do you have any general advice for my exercise?
+
","
+To shorten the training process, stop the training loop after a fixed number of batches, like so:
+for local_batch, local_labels in train_loader:
+    cont += 1
+    if cont == number_u_want_to_stop:
+        break  # leave the for loop and continue with the rest
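An alternative way to cap the number of batches, shown here only as a sketch, is `itertools.islice`, which works on any iterable. `fake_loader` and `max_batches` are assumptions made for the illustration; in the question the iterable would be `train_loader`.

```python
from itertools import islice

# Stand-in for a DataLoader: any iterable yielding (batch, labels) pairs works.
fake_loader = [([i, i + 1], [i % 2, (i + 1) % 2]) for i in range(0, 20, 2)]

max_batches = 3  # hypothetical cap for a quick first tuning run
seen = 0
for local_batch, local_labels in islice(fake_loader, max_batches):
    seen += 1  # fitting on the batch would go here

print(seen)  # 3
```

This keeps the loop body free of counters and break statements, at the cost of always taking the first batches the loader yields.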
+
+
+Always use your GPU for training and inference (that is, using a model to make predictions), because it can be more than 20 times faster than even the best CPU.
+
+No, you don't have to set it to True again. That is the main point of the "with" syntax: once the code inside the with block has finished, the setting it applied is reverted automatically. So you can delete the line with torch.set_grad_enabled(True):
+
+As I said in the second point, use your GPU for all your projects, but keep in mind that you will need a graphics card with at least 4GB of memory to train even small models.
+Here is the install command for using the GPU on Windows:
+pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio===0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
+And here is the one for Linux:
+pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
+
+
+The PyTorch documentation explains how to use the GPU in PyTorch.
+
+- A very nice starter project, which probably everyone does when starting out with machine learning (especially those interested in computer vision), is image classification on the MNIST dataset. There are many great tutorials out there. At first it will be overwhelming with all the new vocabulary, but I promise it gets better once you start to speak the same language as the people writing those tutorials. So first follow a tutorial, and whenever you don't understand a term, look it up on its own and work through the material in small pieces, because otherwise it will be very hard to comprehend. After you have gained some basic knowledge you can start to build your own small projects. Start with something little, and keep grinding :)
+
",python
+"Dynamic attribute in a Python C moduleI have a custom Python module written in C, and I want to add an attribute to the module which is dynamically populated. E.g.:
+import mymod
+print(mymod.x) # At this point, the value of x is computed
+
+The name of the attribute is known in advance.
+From what I understand, this should be possible using descriptors, but it is not working as expected. I implemented a custom type, implemented the tp_descr_get function for the type, and assigned an instance of the type to my module, but the tp_descr_get function is never called.
+Here is my test module:
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include <stdio.h>
+
+static struct PyModuleDef testmod = {
+ PyModuleDef_HEAD_INIT,
+ "testmod",
+ NULL,
+ -1
+};
+
+typedef struct testattrib_s {
+ PyObject_HEAD
+} testattrib;
+
+static PyObject *testattrib_descr_get(PyObject *self, PyObject *obj, PyObject *type);
+static int testattrib_descr_set(PyObject *self, PyObject *obj, PyObject *value);
+
+PyTypeObject testattribtype = {
+ PyVarObject_HEAD_INIT (NULL, 0)
+ "testattrib", /* tp_name */
+ sizeof (testattrib), /* tp_basicsize */
+ /* lots of zeros omitted for brevity */
+ testattrib_descr_get, /* tp_descr_get */
+ testattrib_descr_set /* tp_descr_set */
+};
+
+PyMODINIT_FUNC
+PyInit_testmod(void)
+{
+ if (PyType_Ready(&testattribtype)) {
+ return NULL;
+ }
+
+ testattrib *attrib = PyObject_New(testattrib, &testattribtype);
+ if (attrib == NULL) {
+ return NULL;
+ }
+
+ PyObject *m = PyModule_Create(&testmod);
+ if (m == NULL) {
+ return NULL;
+ }
+
+ if (PyModule_AddObject(m, "myattrib", (PyObject *) attrib)) {
+ return NULL;
+ }
+
+ return m;
+}
+
+static PyObject *testattrib_descr_get(PyObject *self, PyObject *obj, PyObject *type)
+{
+ printf("testattrib_descr_get called\n");
+ Py_INCREF(self);
+ return self;
+}
+
+static int testattrib_descr_set(PyObject *self, PyObject *obj, PyObject *value)
+{
+ printf("testattrib_descr_set called\n");
+ return 0;
+}
+
+I test it like this:
+import testmod
+
+print(testmod.myattrib) # should call tp_descr_get
+testmod.myattrib = 1 # should call tp_descr_set
+
+The getter/setter functions are never called. What am I doing wrong?
+I am running Python 3.8.5 on macOS 12.0.1 with a build from Anaconda:
+>>> sys.version
+'3.8.5 (default, Sep 4 2020, 02:22:02) \n[Clang 10.0.0 ]'
+
","Descriptors operate only as attributes on a type. You would have to create your module as an instance of a module subclass equipped with the descriptor. The easiest way to do that is to use the Py_mod_create slot (not to be confused with __slots__).
",python
+"Can't use csv pipelines and images pipelines within a spider correctlyI'm trying to figure out any way to write first two fields in a csv file and to use the last two fields to download images in a folder simultaneously. I've created two custom pipelines to achieve that.
+This is the spider:
+import scrapy
+
+class PagalWorldSpider(scrapy.Spider):
+ name = 'pagalworld'
+ start_urls = ['https://www.pagalworld.pw/indian-pop-mp3-songs-2021/files.html']
+
+ custom_settings = {
+ 'ITEM_PIPELINES': {
+ 'my_project.pipelines.PagalWorldImagePipeline': 1,
+ 'my_project.pipelines.CSVExportPipeline': 300
+ },
+ 'IMAGES_STORE': r"C:\Users\WCS\Desktop\Images",
+ }
+
+ def start_requests(self):
+ for start_url in self.start_urls:
+ yield scrapy.Request(start_url,callback=self.parse)
+
+ def parse(self, response):
+ for item in response.css(".files-list .listbox a[href]::attr(href)").getall():
+ inner_page_link = response.urljoin(item)
+ yield scrapy.Request(inner_page_link,callback=self.parse_download_links)
+
+ def parse_download_links(self,response):
+ title = response.css("h1.title::text").get()
+ categories = ', '.join(response.css("ul.breadcrumb > li > a::text").getall())
+
+ file_link = response.css(".file-details audio > source::attr(src)").get()
+ image_link = response.urljoin(response.css(".alb-img-det > img[data-src]::attr('data-src')").get())
+ image_name = file_link.split("-")[-1].strip().replace(" ","_").replace(".mp3","")
+
+ yield {"Title":title,"categories":categories,"image_urls":[image_link],"image_name":image_name}
+
+If I execute the script as is, I get all four fields in a csv file, the fields that I'm yielding within parse_download_links method. The script is also downloading and renaming images accurately.
+The first two fields Title and categories are what I wish to write to the csv file, not image_urls and image_name. However, this two fields image_urls and image_name are meant to download and rename images.
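+How can I use both of the pipelines correctly?
","You don't have to create a CSV pipeline just for this purpose. Scrapy's built-in feed exports can write the CSV, and FEED_EXPORT_FIELDS limits the output to the fields you list: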
+import scrapy
+
+
+class PagalWorldSpider(scrapy.Spider):
+ name = 'pagalworld'
+ start_urls = ['https://www.pagalworld.pw/indian-pop-mp3-songs-2021/files.html']
+
+ custom_settings = {
+ 'ITEM_PIPELINES': {
+ 'my_project.pipelines.PagalWorldImagePipeline': 1,
+ # 'my_project.pipelines.CSVExportPipeline': 300
+ },
+ 'IMAGES_STORE': r'C:\Users\WCS\Desktop\Images',
+ 'FEEDS': {
+ r'file:///C:\Users\WCS\Desktop\output.csv': {'format': 'csv', 'overwrite': True}
+ },
+ 'FEED_EXPORT_FIELDS': ['Title', 'categories']
+ }
+
+ def start_requests(self):
+ for start_url in self.start_urls:
+ yield scrapy.Request(start_url, callback=self.parse)
+
+ def parse(self, response):
+ for item in response.css(".files-list .listbox a[href]::attr(href)").getall():
+ inner_page_link = response.urljoin(item)
+ yield scrapy.Request(inner_page_link, callback=self.parse_download_links)
+
+ def parse_download_links(self,response):
+ title = response.css("h1.title::text").get()
+ categories = ', '.join(response.css("ul.breadcrumb > li > a::text").getall())
+
+ file_link = response.css(".file-details audio > source::attr(src)").get()
+ image_link = response.urljoin(response.css(".alb-img-det > img[data-src]::attr('data-src')").get())
+ image_name = file_link.split("-")[-1].strip().replace(" ", "_").replace(".mp3", "")
+
+ yield {"Title": title, "categories": categories, "image_urls": [image_link], "image_name": image_name}
+
+Output:
+Heartfail - Mika Singh mp3 song Download PagalWorld.com,"Home, MUSIC, INDIPOP, Indian Pop Mp3 Songs 2021"
+Fakir - Hansraj Raghuwanshi mp3 song Download PagalWorld.com,"Home, MUSIC, INDIPOP, Indian Pop Mp3 Songs 2021"
+Humsafar - Suyyash Rai mp3 song Download PagalWorld.com,"Home, MUSIC, INDIPOP, Indian Pop Mp3 Songs 2021"
+...
+...
+...
+
+EDIT:
+main.py:
+from scrapy.crawler import CrawlerProcess
+from scrapy.utils.project import get_project_settings
+
+
+if __name__ == "__main__":
+ spider = 'pagalworld'
+ settings = get_project_settings()
+ settings['USER_AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
+ process = CrawlerProcess(settings)
+ process.crawl(spider)
+ process.start()
+
",python
+"Python - why the print result is repeated and ""write to a text"" only has one lineLovely people! I'm totally new with Python. I tried to scrape several URLs and encountered a problem with "print".
+I tried to print and write the "shipment status".
+I have two URLs, so ideally I get two results.
+This is my code:
+from bs4 import BeautifulSoup
+import re
+import urllib.request
+import urllib.error
+import urllib
+
+# read urls of websites from text file
+list_open = open("c:/Users/***/Downloads/web list.txt")
+read_list = list_open.read()
+line_in_list = read_list.split("\n")
+
+for url in line_in_list:
+ soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'html')
+ # parse something special in the file
+ shipment = soup.find_all('span')
+ Preparation=shipment[0]
+ Sent=shipment[1]
+ InTransit=shipment[2]
+ Delivered=shipment[3]
+ for p in shipment:
+# extract information
+ print (url,';',"Preparation",Preparation.getText(),";","Sent",Sent.getText(),";","InTransit",InTransit.getText(),";","Delivered",Delivered.getText())
+
+import sys
+
+file_path = 'randomfile.txt'
+sys.stdout = open(file_path, "w")
+print(url,';',"Preparation",Preparation.getText(),";","Sent",Sent.getText(),";","InTransit",InTransit.getText(),";","Delivered",Delivered.getText())`
+
+I have two problems here:
+
+- Problem one: I have only two URLs, and when I print the results, every "span" is repeated 4 times (as there are four "span"s).
+The result in the "output" is as below:
+
+(I deleted the result example to protect privacy.)
+
+- Problem two: I tried to write the "print" to a text file, but only one line appeared in the file:
+
+(I deleted the result example to protect privacy.)
+I want to know what is wrong in the code. I want to print 2 url results only.
+Your help is really appreciated!
+Thank you in advance!
","The first problem is caused by iterating over shipment. Just delete the inner for loop and fix the indentation of print():
+for url in line_in_list:
+    soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'html')
+    # parse something special in the file
+    shipment = soup.find_all('span')
+    Preparation=shipment[0]
+    Sent=shipment[1]
+    InTransit=shipment[2]
+    Delivered=shipment[3]
+
+    print (url,';',"Preparation",Preparation.getText(),";","Sent",Sent.getText(),";","InTransit",InTransit.getText(),";","Delivered",Delivered.getText())
+
+The second problem occurs because you write to the file outside the loop, so only the values from the last iteration are written. Open the file once (here in append mode) and write inside the loop. You will end up with this as your loop:
+# open file in append mode
+with open('somefile.txt', 'a') as f:
+    # start iterating your urls
+    for url in line_in_list:
+        soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'html')
+        # parse something special in the file
+        shipment = soup.find_all('span')
+        Preparation=shipment[0]
+        Sent=shipment[1]
+        InTransit=shipment[2]
+        Delivered=shipment[3]
+        # create output text
+        line = f'{url};Preparation{Preparation.getText()};Sent{Sent.getText()};InTransit{InTransit.getText()};Delivered{Delivered.getText()}'
+        # print output text
+        print(line)
+        # append output text to file
+        f.write(line+'\n')
+
+And you can delete:
+import sys
+file_path = 'randomfile.txt'
+sys.stdout = open(file_path, "w")
+print(url,';',"Preparation",Preparation.getText(),";","Sent",Sent.getText(),";","InTransit",InTransit.getText(),";","Delivered",Delivered.getText())`
+
+Example of slightly optimized code:
+from bs4 import BeautifulSoup
+import urllib.request
+import urllib.error
+import urllib
+
+# read urls of websites from text file
+list_open = open("c:/Users/***/Downloads/web list.txt")
+read_list = list_open.read()
+line_in_list = read_list.split("\n")
+file_path = "randomfile.txt"
+
+with open(file_path, 'a', encoding='utf-8') as f:
+    for url in line_in_list:
+        soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'html')
+        # parse something special in the file
+        shipment = list(soup.select_one('#progress').stripped_strings)
+        line = f"{url},{';'.join([':'.join(x) for x in list(zip(shipment[::2], shipment[1::2]))])}"
+        print(line)
+        f.write(line + '\n')
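The line-building expression in the optimized version is dense, so here it is unpacked as a small sketch. The `shipment` list and the URL are made-up stand-ins for what `stripped_strings` might return. The only assumption carried over from the answer is that labels and values alternate.

```python
# Made-up stand-in for list(soup.select_one('#progress').stripped_strings):
# labels and values alternate, as the slicing in the answer assumes.
shipment = ['Preparation', 'done', 'Sent', 'done',
            'InTransit', 'pending', 'Delivered', 'pending']
url = 'https://example.com/track/1'  # hypothetical URL

# Even positions are labels, odd positions are their values.
pairs = zip(shipment[::2], shipment[1::2])
status = ';'.join(':'.join(pair) for pair in pairs)
line = f'{url},{status}'

print(line)
```

If the page ever yields an odd number of strings, `zip` silently drops the last label, so it may be worth checking the length first.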
+
",python
+"Python : Get information from multiple tables sql, and fill DashI have multiple tables in a db SQL.
+I am trying to do Join so that I can have a several information from each table.
+The select works, the data also ( when I print it, It has data), but I cannot fill the table in Dash. It goes empty with the columns that I created
+Something is wrong... Thank you in advance
+conn = script.connect_to_db(host, user, psw, db_name, port)
+cursor = conn.cursor()
+sql_statement ="""SELECT li.book_name,au.autor_name,bo.references
+FROM library li
+INNER JOIN autors au ON au.id=li.id_autors
+INNER JOIN books bo ON bo.id_references=au.id
+"""
+cursor.execute(sql_statement)
+
+data=cursor.fetchall()
+print('data',data)
+
+columslist = [
+ {"name": ["book name"], "id": "book_name", "type": 'text'},
+ {"name": ["autors"], "id": "autors", "type": 'text'},
+ {"name": ["references"], "id": "references", "type": 'text'},
+
+]
+
+layout = html.Div([
+ html.Br(),
+ html.Div([dash_table.DataTable(
+ id='datatable-library',
+ columns=columslist ,
+ data=data,
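+
","data should be a list of dictionaries, one per row, keyed by the column ids. Note that dict(data) fails on rows of three values, so build the records explicitly:
+records = [dict(zip(['book_name', 'autors', 'references'], row)) for row in data]
+layout = html.Div([
+ html.Br(),
+ html.Div([dash_table.DataTable(
+ id='datatable-library',
+ columns=columslist ,
+ data=records,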
+
",python
+"How to normalize a JSON file into a Pandas dataframeI have a JSON file named stocks.json that looks as follows (note the lack of square brackets in the source file):
+{"MSFT": {"exchange": "Nasdaq", "price": 275.79}, "FB": {"exchange": "Nasdaq", "price": 320.22}, "TSLA": {"exchange": "Nasdaq", "price": 990.83}, "GE": {"exchange": "Nasdaq", "price": 83.20}}
+
+I would like to transform this data into a Pandas dataframe that looks as follows:
+symbol exchange price
+MSFT Nasdaq 275.79
+FB Nasdaq 320.22
+TSLA Nasdaq 990.83
+GE NYSE 83.20
+
+My attempt is:
+import pandas as pd
+
+stock_data = pd.read_json('stocks.json', lines=True)
+stock_data_normalized = pd.json_normalize(stock_data)
+
+Unfortunately, I get the following when calling stock_data_normalized:
+0
+1
+2
+3
+
+Any assistance would be most appreciated. Thanks!
","You can just use the pd.DataFrame() constructor on the parsed JSON (d below is the dict loaded from the file), then transpose and reset the index:
+df = pd.DataFrame(d).T.reset_index().rename({'index': 'symbol'}, axis=1)
+
+Output:
+>>> df
+ symbol exchange price
+0 MSFT Nasdaq 275.79
+1 FB Nasdaq 320.22
+2 TSLA Nasdaq 990.83
+3 GE Nasdaq 83.2
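The snippet above uses a dict named d without showing where it comes from. A minimal sketch of that step, using only the standard library, is given below. The JSON text is inlined from the question so the example is self-contained, and `rows` is a name introduced here for illustration.

```python
import json

# Same content as stocks.json in the question, inlined for a self-contained sketch.
raw = ('{"MSFT": {"exchange": "Nasdaq", "price": 275.79}, '
       '"FB": {"exchange": "Nasdaq", "price": 320.22}, '
       '"TSLA": {"exchange": "Nasdaq", "price": 990.83}, '
       '"GE": {"exchange": "Nasdaq", "price": 83.20}}')

d = json.loads(raw)  # with the real file: json.load(open('stocks.json'))

# One flat record per ticker, with the outer key promoted to a 'symbol' field.
rows = [{'symbol': symbol, **fields} for symbol, fields in d.items()]

print(rows[0])  # {'symbol': 'MSFT', 'exchange': 'Nasdaq', 'price': 275.79}
```

Passing `rows` to `pd.DataFrame` should give the same three-column table as the transpose approach, since a list of dicts is read as one row per dict.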
+
",python
+"Connecting to random points in a 2d numpy array based on distanceI have a 2d numpy array and select a random coordinate position (say 10x10 array and start at position 2,3). I want to randomly connect to 40% of the other points in the 2d array effectively generating a list of tuples [(x1, y1), (x2, y2) ...] where the list is 40% of the other coordinates.
+An additional constraint, however, is the goal is to reduce connection probability the farther the points are away from one another (so point 2,3 is far more likely to connect to 2,2 than 9, 8 but yet still be random so there is a chance albeit small of connecting to 9, 8).
+I believe I need to create some sort of Guassian function centered on 2,3 and use this to select the points, but any Gaussian I create will generate non-integer values - requiring additional logic as well as presents problem of needing to handle x and y dimensions separately.
+Currently, I am trying to use np.meshgrid with
+gauss = np.exp(-(dst2 / (2.0 * sigma2)))
+Is there an easier way to do this or a different approach someone might recommend?
","This problem is well suited for rejection sampling.
+Basically, you randomly choose a point and decide whether a connection should be made based on a defined probability. Keep in mind that there are many more points at larger distances than at closer ones (their number grows with the radius), so you may have to introduce an extra weighting in the probability function. In this case I chose an exponentially decaying probability.
+This code is not optimal in terms of speed, particularly for higher connectivity percentages, but the ideas are better presented this way; see below for a better option.
+import numpy as np
+from numpy.random import default_rng
+
+rng = default_rng()
+board = np.zeros((100, 100), dtype=bool)
+percent_connected = 4
+N_points = round((board.size - 1) * percent_connected/100)
+center = np.array((20, 30))
+board[tuple(center)] = True # remove the center point from the pool
+dist_char = 35 # characteristic distance where probability decays to 1/e
+
+endpoints = []
+while N_points:
+ point = rng.integers(board.shape)
+ if not board[tuple(point)]:
+ dist = np.sqrt(np.sum((center-point)**2))
+ P = np.exp(-dist / dist_char)
+ if rng.random() < P:
+ board[tuple(point)] = True
+ endpoints.append(point)
+ N_points -= 1
+board[tuple(center)] = False # clear the center point
+
+# Graphical test
+import matplotlib.pyplot as plt
+
+plt.figure()
+for ep in endpoints:
+ plt.plot(*zip(center, ep), c="blue")
+
+
+A slightly different approach is much faster at higher connectivity:
+rng = default_rng()
+board = np.zeros((100, 100), dtype=bool)
+percent_connected = 4
+N_points = round((board.size - 1) * percent_connected/100)
+center = np.array((20, 30))
+board[tuple(center)] = True # remove the center point from the pool
+dist_char = 35 # characteristic distance where probability decays to 1/e
+flat_board = board.ravel()
+endpoints = []
+while N_points:
+ idx = rng.integers(flat_board.size)
+ while flat_board[idx]:
+ idx += 1
+ if idx >= flat_board.size:
+ idx = 0
+ if not flat_board[idx]:
+ point = np.array((idx // board.shape[1], idx % board.shape[1]))  # row-major: divide by the number of columns
+ dist = np.sqrt(np.sum((center-point)**2))
+ P = np.exp(-dist / dist_char)
+ if rng.random() < P:
+ flat_board[idx] = True
+ endpoints.append(point)
+ N_points -= 1
+board[tuple(center)] = False # clear the center point
+
+
+plt.figure()
+for ep in endpoints:
+ plt.plot(*zip(center, ep), c="blue")
+
",python
+"Get indexes of Pandas Rolling windowI would like to get the indexes of the elements in each rolling window of a Pandas Series.
+A solution that works for me is from this answer to an existing question: I get the window.index for each window obtained from the rolling function described in the answer. I am only interested in step=1 for the aforementioned function.
+But this function is not specific for DataFrames and Series, it would work on basic Python lists.
+Isn't there some functionality that takes advantage of Pandas rolling operations?
+I tried the Rolling.apply method:
+s = pd.Series([1, 2, 3, 4, 5, 6, 7])
+
+rolling = s.rolling(window=3)
+indexes = rolling.apply(lambda x: x.index)
+
+But it results in a TypeError: must be real number, not RangeIndex. Apparently, the Rolling.apply method only accepts functions that return a number based on each window. The functions cannot return other kinds of objects.
+Are there other methods of the Pandas Rolling class I could use? Even private methods.
+Or are there any other Pandas-specific functionalities to get the indexes of overlapping rolling windows?
+Expected output
+As output, I expect some kind of list-of-lists object. Each inner list should contain the index values of each window.
+The original s Series has [0, 1, 2, 3, 4, 5, 6] as index.
+So, rolling with a window=3, I expect as outcome something like:
+[
+ [0, 1, 2],
+ [1, 2, 3],
+ [2, 3, 4],
+ [3, 4, 5],
+ [4, 5, 6],
+]
+
","The apply function after rolling must return a numeric value for each window. One possible workaround is to use a list comprehension to iterate over each window and apply the custom transformation as required:
+[[*l.index] for l in s.rolling(3) if len(l) == 3]
+
+Alternatively you can also use sliding_window_view to accomplish the same:
+np.lib.stride_tricks.sliding_window_view(s.index, 3)
+
+Or a list comprehension over index slices does the job just fine:
+w = 3
+[[*s.index[i : i + w]] for i in range(len(s) - w + 1)]
+
+Result
+[[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
+
",python
+"having trouble with function and asynciohere is my Code so far:
+import discord
+from discord import Webhook, AsyncWebhookAdapter
+from discord.ext import commands
+from discord import Activity, ActivityType
+import aiohttp
+from bs4 import BeautifulSoup
+from requests_html import AsyncHTMLSession
+intents = discord.Intents.default()
+intents.members = True
+
+client = commands.Bot(command_prefix="$", intents=intents, case_insensitive=True)
+
+async def amazon():
+ URL = "https://www.amazon.com/s?k=gaming"
+
+ with AsyncHTMLSession() as session:
+ response = await session.get(URL)
+ response.html.arender(timeout=20)
+
+ soup = BeautifulSoup(response.html.html, "lxml")
+ results = soup.select("a.a-size-base.a-link-normal.s-link-style.a-text-normal")
+
+ max_price = 10
+
+ for result in results:
+ price = result.text.split('$')[1].replace(",", "")
+ if float(price) < max_price:
+ print(f"Price: ${price}\nLink: https://www.amazon.com{result['href'].split('?')[0]}")
+
+
+
+@client.command()
+async def amaz(ctx):
+ await amazon()
+ await ctx.send("hello")
+
+
+
+client.run("iputmytokenhere")
+
+
+here is the error I get when doing $amaz:
+RuntimeWarning: coroutine 'HTML.arender' was never awaited
+ response.html.arender(timeout=20)
+RuntimeWarning: Enable tracemalloc to get the object allocation traceback
+C:\Users\CK\AppData\Roaming\Python\Python37\site-packages\requests\sessions.py:428: RuntimeWarning: coroutine 'AsyncHTMLSession.close' was never awaited
+ self.close()
+RuntimeWarning: Enable tracemalloc to get the object allocation traceback
+
+I am using this as a fun project, any help is greatly appreciated. I tried many things but nothing seems to be working. I want the bot to send the scraped data to a discord webhook.
","You can use the following code for the amazon() function.
+async def amazon():
+ URL = "https://www.amazon.com/s?k=gaming"
+ # add the string of the webhook url
+ WEBHOOK_URL = "https://your_webhook_url"
+
+ # change the with statement to assignment following the documentation
+ session = AsyncHTMLSession()
+ response = await session.get(URL)
+ # add await to prevent the "was not awaited" error
+ await response.html.arender(timeout=20)
+ # create the webhook object
+ webhook = Webhook.from_url(WEBHOOK_URL, adapter=AsyncWebhookAdapter(session))
+
+ soup = BeautifulSoup(response.html.html, "lxml")
+ results = soup.select("a.a-size-base.a-link-normal.s-link-style.a-text-normal")
+
+ max_price = 10
+
+ for result in results:
+ price = result.text.split('$')[1].replace(",", "")
+ if float(price) < max_price:
+ # change print() to webhook.send() to send the data from a webhook
+ await webhook.send(f"Price: ${price}\nLink: https://www.amazon.com{result['href'].split('?')[0]}")
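The "was never awaited" warning can be reproduced in isolation with plain asyncio. In this sketch `render` is a hypothetical coroutine standing in for `response.html.arender`. Calling it without await only creates a coroutine object, and nothing runs until it is awaited.

```python
import asyncio


async def render():
    # Hypothetical stand-in for response.html.arender(timeout=20).
    await asyncio.sleep(0)
    return 'rendered'


async def main():
    pending = render()  # no await: only a coroutine object, nothing has run yet
    is_coro = asyncio.iscoroutine(pending)
    result = await pending  # awaiting is what actually executes the body
    return is_coro, result


is_coro, result = asyncio.run(main())
print(is_coro, result)  # True rendered
```

If `pending` were dropped without ever being awaited, Python would emit the same RuntimeWarning shown in the question.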
+
",python
+"How to create and annotate a stacked proportional bar chartI'm struggling to create a stacked bar chart derived from value_counts() of a columns from a dataframe.
+Assume a dataframe like the following, where responder is not important, but would like to stack the count of [1,2,3,4,5] for all q# columns.
+responder, q1, q2, q3, q4, q5
+------------------------------
+r1, 5, 3, 2, 4, 1
+r2, 3, 5, 1, 4, 2
+r3, 2, 1, 3, 4, 5
+r4, 1, 4, 5, 3, 2
+r5, 1, 2, 5, 3, 4
+r6, 2, 3, 4, 5, 1
+r7, 4, 3, 2, 1, 5
+
+It should look something like this, except each bar would be labeled by q# and would include 5 sections for the count of [1,2,3,4,5] from the data:
+![]()
+Ideally, all bars will be "100%" wide, showing the count as a proportion of the bar. But it's guaranteed that each responder row will have one entry for each, so the percentage is just a bonus if possible.
+Any help would be much appreciated, with a slight preference for matplotlib solution.
","You can calculate the heights of bars using percentages and obtain the stacked bar plot using ax = percents.T.plot(kind='barh', stacked=True) where percents is a DataFrame with q1,...q5 as columns and 1,...,5 as indices.
+>>> percents
+ q1 q2 q3 q4 q5
+1 0.196873 0.199316 0.206644 0.194919 0.202247
+2 0.205357 0.188988 0.205357 0.205357 0.194940
+3 0.202265 0.217705 0.184766 0.196089 0.199177
+4 0.199494 0.199494 0.190886 0.198481 0.211646
+5 0.196137 0.195146 0.211491 0.205052 0.192174
+
+Then you can use ax.patches to add labels for every bar. Labels can be generated from the original counts DataFrame: counts = df.apply(lambda x: x.value_counts())
+>>> counts
+ q1 q2 q3 q4 q5
+1 403 408 423 399 414
+2 414 381 414 414 393
+3 393 423 359 381 387
+4 394 394 377 392 418
+5 396 394 427 414 388
+
+
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+
+## create some data similar to yours
+np.random.seed(42)
+categories = ['q1','q2','q3','q4','q5']
+df = pd.DataFrame(np.random.randint(1,6,size=(2000, 5)), columns=categories)
+
+## counts will be used for the labels
+counts = df.apply(lambda x: x.value_counts())
+
+## percents will be used to determine the height of each bar
+percents = counts.div(counts.sum(axis=1), axis=0)
+
+counts_array = counts.values
+nrows, ncols = counts_array.shape
+indices = [(i,j) for i in range(0,nrows) for j in range(0,ncols)]
+percents_array = percents.values
+
+ax = percents.T.plot(kind='barh', stacked=True)
+ax.legend(bbox_to_anchor=(1, 1.01), loc='upper right')
+for i, p in enumerate(ax.patches):
+ ax.annotate(f"({p.get_width():.1%})", (p.get_x() + p.get_width() - 0.15, p.get_y() - 0.10), xytext=(5, 10), textcoords='offset points')
+ ax.annotate(str(counts_array[indices[i]]), (p.get_x() + p.get_width() - 0.15, p.get_y() + 0.10), xytext=(5, 10), textcoords='offset points')
+plt.show()
+
+![]()
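To make the counting step concrete for the seven responder rows in the question, here is a standard-library sketch. The names `answers`, `counts` and `shares` are introduced here, and the shares are taken per question so that the segments of each bar add up to 1.

```python
from collections import Counter

# The seven responder rows from the question, regrouped as one list per question.
answers = {
    'q1': [5, 3, 2, 1, 1, 2, 4],
    'q2': [3, 5, 1, 4, 2, 3, 3],
    'q3': [2, 1, 3, 5, 5, 4, 2],
    'q4': [4, 4, 4, 3, 3, 5, 1],
    'q5': [1, 2, 5, 2, 4, 1, 5],
}
ratings = [1, 2, 3, 4, 5]

# How often each rating occurs per question (0 when a rating never occurs).
counts = {q: [Counter(vals)[r] for r in ratings] for q, vals in answers.items()}

# Share of each rating within its question: the widths of the stacked segments.
shares = {q: [c / sum(cs) for c in cs] for q, cs in counts.items()}

print(counts['q1'])  # [2, 2, 1, 1, 1]
```

Each list in `shares` could be drawn as one horizontal bar, with the matching entry of `counts` used as the segment label.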
",python
+"How to make middleware object callable in Django 2.2I'm trying to update a django/mezzanine application from python 2.7 to python 3.7. Can you help me in fixing the error below (CTypeError: 'CheckNewsDateStatus' object is not callable)?
+It seems that this class is not used at all; if I grep through all the code, only the attached settings.py and middleware.py match. Is it something partly implemented in django/mezzanine, or can the whole class be removed as unnecessary?
+I don't know how the code was intended to work, whether the feature has been used at all, or how callable objects should be declared in the settings.py file.
+(python-3.7) miettinj@ramen:~/pika-py-3.7/pika> python manage.py runserver 0:8034
+BASE_DIR /srv/work/miettinj/pika-py-3.7/pika
+PROJECT_ROOT /srv/work/miettinj/pika-py-3.7/pika/pika
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/mezzanine/utils/timezone.py:13: PytzUsageWarning: The zone attribute is specific to pytz's interface; please migrate to a new time zone provider.
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/mezzanine/utils/conf.py:67: UserWarning: TIME_ZONE setting is not set, using closest match: Europe/Helsinki
+ warn("TIME_ZONE setting is not set, using closest match: %s" % tz)
+BASE_DIR /srv/work/miettinj/pika-py-3.7/pika
+PROJECT_ROOT /srv/work/miettinj/pika-py-3.7/pika/pika
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/mezzanine/utils/timezone.py:13: PytzUsageWarning: The zone attribute is specific to pytz's interface; please migrate to a new time zone provider.
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/mezzanine/utils/conf.py:67: UserWarning: TIME_ZONE setting is not set, using closest match: Europe/Helsinki
+ warn("TIME_ZONE setting is not set, using closest match: %s" % tz)
+Watching for file changes with StatReloader
+ .....
+ _d^^^^^^^^^b_
+ .d'' ``b.
+ .p' `q.
+ .d' `b.
+ .d' `b. * Mezzanine 5.0.0
+ :: :: * Django 2.2
+ :: M E Z Z A N I N E :: * Python 3.7.10
+ :: :: * PostgreSQL 9.3.0
+ `p. .q' * Linux 5.3.18-lp152.102-default
+ `p. .q'
+ `b. .d'
+ `q.. ..p'
+ ^q........p^
+ ''''
+
+Performing system checks...
+
+System check identified no issues (0 silenced).
+December 31, 2021 - 14:08:56
+Django version 2.2, using settings 'pika.settings'
+Starting development server at http://0:8034/
+Quit the server with CONTROL-C.
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/django/core/handlers/base.py:37:
+/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/django/core/handlers/base.py:37: FutureWarning: `TemplateForDeviceMiddleware` is deprecated. Please remove it from your middleware settings.
+ mw_instance = middleware(handler)
+Internal Server Error: /admin/
+Traceback (most recent call last):
+ File "/srv/work/miettinj/python-3.7/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
+ response = get_response(request)
+TypeError: 'CheckNewsDateStatus' object is not callable
+
+middleware.py:
+class CheckNewsDateStatus:
+ def __init__(self, get_response):
+ self.get_response = get_response
+
+ def process_request(self, request):
+ if '/uutinen/' in request.path:
+ try:
+ path_to_go_raw = request.path
+ true_slug = path_to_go_raw.split('/uutinen/')[1:]
+ news_obj = Uutinen.objects.get(slug=true_slug[0])
+ now_utc = pytz.utc.localize(datetime.now())
+
+ hel = pytz.timezone("Europe/Helsinki")
+ foo = news_obj.publish_date.astimezone(hel)
+
+ if foo.date() < now_utc.date() and news_obj.status == 2:
+ pass
+ elif foo.date() == now_utc.date() and foo.time() < now_utc.time() and news_obj.status == 2:
+ pass
+ else:
+ print("authenticated-->", request.user.is_authenticated())
+ if request.user.is_authenticated():
+ pass
+ elif news_obj.status == 1:
+ return HttpResponseNotFound('404')
+ else:
+ return HttpResponseNotFound('404')
+ except:
+ pass
+
+
+
+settings.py:
+ # List of middleware classes to use. Order is important; in the request phase,
+ # these middleware classes will be applied in the order given, and in the
+ # response phase the middleware will be applied in reverse order.
+ MIDDLEWARE = (
+ 'page_types.middleware.CheckNewsDateStatus',
+ 'page_types.middleware.SetDynamicSite',
+ # 'debug_toolbar.middleware.DebugToolbarMiddleware',
+ "mezzanine.core.middleware.UpdateCacheMiddleware",
+ "django.contrib.sessions.middleware.SessionMiddleware",
+ #"django.middleware.locale.LocaleMiddleware",
+ "statfi_search.middleware.locale.LocaleMiddleware",
+ "django.contrib.auth.middleware.AuthenticationMiddleware",
+ "django.middleware.common.CommonMiddleware",
+ "django.middleware.csrf.CsrfViewMiddleware",
+ "django.contrib.messages.middleware.MessageMiddleware",
+ "mezzanine.core.request.CurrentRequestMiddleware",
+ "mezzanine.core.middleware.RedirectFallbackMiddleware",
+ "mezzanine.core.middleware.TemplateForDeviceMiddleware",
+ "mezzanine.core.middleware.TemplateForHostMiddleware",
+ "mezzanine.core.middleware.AdminLoginInterfaceSelectorMiddleware",
+ "mezzanine.core.middleware.SitePermissionMiddleware",
+ # Uncomment the following if using any of the SSL settings:
+ # "mezzanine.core.middleware.SSLRedirectMiddleware",
+ "mezzanine.pages.middleware.PageMiddleware",
+ "mezzanine.core.middleware.FetchFromCacheMiddleware",
+ 'page_types.middleware.RedirectMiddleware',
+ )
+
","You have old-style middleware here; you need to inherit from MiddlewareMixin.
+Change your code so that CheckNewsDateStatus inherits from MiddlewareMixin, like this:
+# this will probably be in the page_types/middleware.py file
+from django.utils.deprecation import MiddlewareMixin
+
+class CheckNewsDateStatus(MiddlewareMixin):
+    def __init__(self, get_response):
+        self.get_response = get_response
+
+    def process_request(self, request):
+        if '/uutinen/' in request.path:
+            # rest of code
+
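To see why inheriting from the mixin makes the object callable, here is a minimal sketch. MiddlewareMixinSketch is a simplified stand-in written for illustration, not Django's actual class, and BlockNews and view are hypothetical names; requests are plain strings so that the example runs without Django.

```python
# Simplified stand-in for django.utils.deprecation.MiddlewareMixin (illustration only):
# __call__ runs the old-style hooks around get_response.
class MiddlewareMixinSketch:
    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        # Only fall through to the next layer if the hook did not short-circuit.
        response = response or self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response


class BlockNews(MiddlewareMixinSketch):
    # Hypothetical old-style middleware: it only defines process_request.
    def process_request(self, request):
        if '/uutinen/' in request:
            return '404'
        return None


def view(request):
    # Stands in for the rest of the middleware chain and the view.
    return '200 for ' + request


middleware = BlockNews(view)
print(middleware('/admin/'))
print(middleware('/uutinen/foo/'))
```

Without a __call__ method, Django's handler cannot invoke the instance, which is exactly the TypeError in the traceback. Separately, request.user.is_authenticated is a property in Django 2.x, so calling it with parentheses fails; the bare except in the original class would silently hide that.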
",python
+"How to compare two schema in Databricks notebook in pythonI'm going to ingest data using databricks notebook. I want to validate the schema of the data ingested against what I'm expecting the schema of these data to be.
+So basically I have:
+ validation_schema = StructType([
+ StructField("a", StringType(), True),
+ StructField("b", IntegerType(), False),
+ StructField("c", StringType(), False),
+ StructField("d", StringType(), False)
+ ])
+
+ data_ingested_good = [("foo",1,"blabla","36636"),
+ ("foo",2,"booboo","40288"),
+ ("bar",3,"fafa","42114"),
+ ("bar",4,"jojo","39192"),
+ ("baz",5,"jiji","32432")
+ ]
+
+ data_ingested_bad = [("foo","1","blabla","36636"),
+ ("foo","2","booboo","40288"),
+ ("bar","3","fafa","42114"),
+ ("bar","4","jojo","39192"),
+ ("baz","5","jiji","32432")
+ ]
+
+ data_ingested_good.printSchema()
+ data_ingested_bad.printSchema()
+ validation_schema.printSchema()
+
+I've seen similar questions but answers are always in scala.
","It really depends on your exact requirements and the complexity of the schemas that you want to compare: for example, whether to ignore the nullability flag or take it into account, the order of columns, support for maps/structs/arrays, etc. Also, do you want to see the differences, or just a flag saying whether the schemas match?
+In the simplest case, you can just compare the string representations of the schemas (note that simpleString() does not include the nullability flag):
+def compare_schemas(df1, df2):
+ return df1.schema.simpleString() == df2.schema.simpleString()
+
+I personally would recommend using an existing library, like Chispa, which has more advanced schema comparison functions: you can tune the checks, it will show the differences, etc. After installation (you can just do %pip install chispa), this will throw an exception if the schemas are different:
+from chispa.schema_comparer import assert_schema_equality
+
+assert_schema_equality(df1.schema, df2.schema)
+
",python
+"Adding data to an existing JSON file without overwriting itThe idea:
+
I want to add a JSON object to an existing JSON file, but not overwrite the existing data in the file.
+
The uid-003 object should come subordinate to uID after the existing uid-xxx entries.
+The problem:
+
No solution approach works as it should. The append() approach also returns the error: AttributeError: 'dict' object has no attribute 'append'.
+The current code
+
Python code:
+
+user = {
+ "uid-003": {
+ "username": "username-3",
+ "pinned": "pinned",
+ "created": {
+ "date": "DD/MM/YYYY",
+ "time": "HH:MM:SS"
+ },
+ "verified": {
+ "checked": False
+ }
+ }
+}
+
+with open("path/to/json", "r+") as file:
+ data = json.load(file)
+
+ temp = data['uID']
+ temp.append(user)
+
+ json.dump(data, file)
+
+JSON file:
+{
+ "uID": {
+ "uid-001": {
+ "username": "username-1",
+ "pinned": false,
+ "created": {
+ "date": "20-12-2021",
+ "time": "21:13:39"
+ },
+ "verified": {
+ "checked": false
+ }
+ },
+ "uid-002": {
+ "username": "username-2",
+ "pinned": true,
+ "created": {
+ "date": "20-12-2021",
+ "time": "21:13:39"
+ },
+ "verified": {
+ "checked": false
+ }
+ }
+ }
+}
+
","All you need to do is add your user dictionary to the existing 'uID' key. If you update a file that's been opened with r+ and you're increasing the amount of data in the file, you'll need to seek to the beginning before writing. This should help:
+import json
+
+THE_FILE = 'the_file.json'
+
+user = {
+ "uid-003": {
+ "username": "username-3",
+ "pinned": "pinned",
+ "created": {
+ "date": "DD/MM/YYYY",
+ "time": "HH:MM:SS"
+ },
+ "verified": {
+ "checked": False
+ }
+ }
+}
+
+with open(THE_FILE, 'r+') as jfile:
+ j = json.load(jfile)
+ for k, v in user.items():
+ j['uID'][k] = v
+ jfile.seek(0)
+ json.dump(j, jfile, indent=4)
+
+Note: The iteration over user.items() isn't really necessary in this case, but it shows how you might use this pattern when user contains more dictionaries than just the one in your example.
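As a variation on the same idea, here is a minimal sketch that merges with dict.update() and also calls truncate(), which matters whenever the rewritten text ends up shorter than what was in the file. The temporary path and the trimmed-down user entries are assumptions made up for the example.

```python
import json
import os
import tempfile

# Hypothetical starting file with the same shape as the JSON in the question.
path = os.path.join(tempfile.mkdtemp(), 'users.json')
with open(path, 'w') as f:
    json.dump({'uID': {'uid-001': {'username': 'username-1'}}}, f, indent=4)

new_users = {'uid-003': {'username': 'username-3'}}

with open(path, 'r+') as f:
    data = json.load(f)
    data['uID'].update(new_users)  # merge every new entry under the existing key
    f.seek(0)                      # rewind before rewriting the whole document
    json.dump(data, f)             # compact output can be shorter than the indented original
    f.truncate()                   # cut off any leftover tail of the old contents

with open(path) as f:
    result = json.load(f)
print(sorted(result['uID']))
```

If the file only ever grows, seek(0) alone is enough, as in the code above; truncate() just makes the pattern safe in both directions.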
",python
+"ValueError: time data does not match format '%y/%m/%d'I am trying to convert a string to a DateTime object but an error occurs.
+This is my code:
+from datetime import datetime
+
+string = "2021/12/18"
+
+final_check_date = datetime.strptime(string, '%y/%m/%d')
+
+print(final_check_date)
+
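+And the error is: ValueError: time data '2021/12/18' does not match format '%y/%m/%d'
","It is just a format error: %y expects a two-digit year (e.g. 21), while your string contains a four-digit year, which needs %Y. With the format '%Y/%m/%d', the string '2021/12/18' parses correctly; likewise, a date string such as '2021-12-18' is parsed by this function if the format given is '%Y-%m-%d'.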
+Code with the solution
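A minimal sketch of the fix, assuming the date string from the question (the variable is renamed to date_string here only to avoid shadowing the string module):

```python
from datetime import datetime

date_string = '2021/12/18'

# %Y matches a four-digit year; %y would only match a two-digit one such as '21'.
final_check_date = datetime.strptime(date_string, '%Y/%m/%d')
print(final_check_date)

# The original format fails: '%y' consumes only '20' and then expects a '/'.
try:
    datetime.strptime(date_string, '%y/%m/%d')
    original_format_failed = False
except ValueError as err:
    original_format_failed = True
    print('ValueError:', err)
```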
",python
+"Why is Colab still running Python 3.7?I saw on this tweet that Google Colab moved to Python 3.7 in February 2021. As of today however (January 2022), Python 3.10 is out, but Colab still runs Python 3.7.
+My (voluntarily) naive take is that this is quite a significant lag.
+Why are they not at least on Python 3.8 or even 3.9?
+Is it simply to make sure that some compatibility criteria are met?
","The only reason is that they want the most widely compatible version of Python. According to the Python Readiness report (Python 3.7 Readiness), almost 80.6% of the most used packages support version 3.7 so far, while this coverage is 78.3% for version 3.8, 70.6% for version 3.9 and 49.7% for version 3.10 (as of March 29, 2022).
+Frankly, if Python 3.6 had not reached its EOL, they would still be using that version today. Luckily for us, python.org decided to drop versions below 3.7.
+On the other hand, you can update the Python version in your Colab by running some Linux commands in the notebook. The problem is that whenever you start a new notebook, Google ignores the update and reverts to the original version.
+The best thing Google could do is offer an option to select the Python version. Because of this, I don't use Colab in most cases, especially when I am teaching Python to my students.
",python
+"FastAPI not loading static filesSo, I'm swapping my project from node.js to python FastAPI. Everything has been working fine with node, but here it says that my static files are not present, so here's the code:
+from fastapi import FastAPI, Request, WebSocket
+from fastapi.responses import HTMLResponse
+from fastapi.staticfiles import StaticFiles
+from fastapi.templating import Jinja2Templates
+
+app = FastAPI()
+
+app.mount("/static", StaticFiles(directory="../static"), name="static")
+templates = Jinja2Templates(directory='../templates')
+
+@app.get('/')
+async def index_loader(request: Request):
+ return templates.TemplateResponse('index.html', {"request": request})
+
+The project's structure looks like this:
+![]()
+Files clearly are where they should be, but when I connect to website the following error is raised:
+INFO: connection closed
+INFO: 127.0.0.1:54295 - "GET /img/separator.png HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54296 - "GET /css/rajdhani.css HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54295 - "GET /js/pixi.min.js HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54296 - "GET /js/ease.js HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54298 - "GET / HTTP/1.1" 200 OK
+INFO: 127.0.0.1:54298 - "GET /img/separator.png HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54299 - "GET /css/rajdhani.css HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54298 - "GET /js/pixi.min.js HTTP/1.1" 404 Not Found
+INFO: 127.0.0.1:54299 - "GET /js/ease.js HTTP/1.1" 404 Not Found
+
+So, basically, any static file that I'm using is missing, and I have no idea what am I doing wrong. How to fix it?
","Here:
+app.mount("/static", StaticFiles(directory="../static"), name="static")
+
+You mount your static directory under the /static path. That means that, to access static files in your HTML, you need to use the /static prefix, e.g. <img src="/static/img/separator.png"/>
",python
+"Parquet with null columns on PyarrowI'm reading a table on PostgreSQL using pandas.read_sql, then I'm converting it as a Pyarrow table and saving it partitioned in local filesystem.
+# Retrieve schema.table data from database
+def basename_file(date_partition):
+ basename_file = f"{table_schema}.{table_name}-{date}.parquet"
+ return basename_file
+
+def get_table_data(table_schema, table_name, date):
+ s = ""
+ s += "SELECT"
+ s += " *"
+ s += " , date(created_on) as date_partition"
+ s += " FROM {table_schema}.{table_name}"
+ s += " WHERE created_on = '{date}';"
+ sql = s.format(table_schema = table_schema, table_name = table_name, date = date)
+# print(sql)
+
+ df = pd.read_sql(sql, db_conn)
+ result = pa.Table.from_pandas(df)
+ pq.write_to_dataset(result,
+ root_path = f"{dir_name}",
+ partition_cols = ['date_partition'],
+ partition_filename_cb = basename_file,
+ use_legacy_dataset = True
+ )
+# print(result)
+ return df
+
+Problem is that my SELECT has a column with some rows as null.
+When I partition this to write (write_to_dataset) to the local filesystem, a few files have only rows with that column as null, so those partitioned Parquet files don't have this column.
+When I try to read that by multiple partitions, I get a schema error, because one of the columns cannot be cast correctly.
+Why is that? Is there any setting I could apply to write_to_dataset to manage this?
+I've been looking for a workaround for this without success...
+My main goal here is to export data daily, partitioned by transaction date and read data from any period needed, not caring about schema evolution: that way, row value for null columns will appear as null, simply put.
","If you can post the exact error message that might be more helpful. I did some experiments with pyarrow 6.0.1 and I found that things work ok as long as the first file contains some valid values for all columns (pyarrow will use this first file to infer the schema for the entire dataset).
+The "first" file is not technically well defined when doing dataset discovery but, at the moment, for a local dataset it should be the first file in alphabetical order.
+If the first file does not have values for all columns then I get the following error:
+Error: Unsupported cast from string to null using function cast_null
+I'm a bit surprised as this sort of cast should be pretty easy (to cast to null just throw away all the data). That being said, you probably don't want all your data thrown away anyways.
+The easiest solution is to provide the full expected schema when you are creating your dataset. If you do not know this ahead of time you can figure it out yourself by inspecting all of the files in the dataset and using pyarrow's unify_schemas. I have an example of doing this in this answer.
+Here is some code demonstrating my findings:
+import os
+
+import pyarrow as pa
+import pyarrow.parquet as pq
+import pyarrow.dataset as ds
+
+tab = pa.Table.from_pydict({'x': [1, 2, 3], 'y': [None, None, None]})
+tab2 = pa.Table.from_pydict({'x': [4, 5, 6], 'y': ['x', 'y', 'z']})
+
+os.makedirs('/tmp/null_first_dataset', exist_ok=True)
+pq.write_table(tab, '/tmp/null_first_dataset/0.parquet')
+pq.write_table(tab2, '/tmp/null_first_dataset/1.parquet')
+
+os.makedirs('/tmp/null_second_dataset', exist_ok=True)
+pq.write_table(tab, '/tmp/null_second_dataset/1.parquet')
+pq.write_table(tab2, '/tmp/null_second_dataset/0.parquet')
+
+try:
+ dataset = ds.dataset('/tmp/null_first_dataset')
+ tab = dataset.to_table()
+ print(f'Was able to read in null_first_dataset without schema.')
+ print(tab)
+except Exception as ex:
+ print('Was not able to read in null_first_dataset without schema')
+ print(f' Error: {ex}')
+print()
+
+try:
+ dataset = ds.dataset('/tmp/null_second_dataset')
+ tab = dataset.to_table()
+ print(f'Was able to read in null_second_dataset without schema.')
+ print(tab)
+except Exception as ex:
+ print('Was not able to read in null_second_dataset without schema')
+ print(f' Error: {ex}')
+print()
+
+dataset = ds.dataset('/tmp/null_first_dataset', schema=tab2.schema)
+tab = dataset.to_table()
+print(f'Was able to read in null_first_dataset by specifying schema.')
+print(tab)
+
",python
+"fastapi get header parameter is none in basemodel use Header cant get token but in function its ok why?in basemodel use Header cant get token but in function its ok why?
+ class Test(BaseModel):
+ name: str
+ token: str= Header(None)
+
+def login(t: Test):
+ print(t.dict())
+ return 'test'
+
+output {'name': '123', 'token': None}
+
+if i do this is ok
+def login(t: Test,token: str= Header(None)):
+ print(t.dict(),token)
+ return 'test'
+
+ output {'name': '123', 'token': None} 123456
+
+who can help me plz !
","The Pydantic model here represents the validation of the request payload (the body).
+The values declared in the "Test" model must be present in the request payload. Since a header is not part of the payload, Header cannot be used inside a BaseModel.
+You can, however, use it in the API function. When the API is called, request parameters such as headers are resolved from the function's signature, so declaring token: str = Header(None) there works.
",python
+"How to load in graph from networkx into PyTorch geometric and set node features and labels?Goal: I am trying to import a graph FROM networkx into PyTorch geometric and set labels and node features.
+(This is in Python)
+Question(s):
+
+- How do I do this [the conversion from networkx to PyTorch geometric]? (presumably by using the
from_networkx function)
+- How do I transfer over node features and labels? (more important question)
+
+I have seen some other/previous posts with this question but they weren't answered (correct me if I am wrong).
+Attempt: (I have just used an unrealistic example below, as I cannot post anything real on here)
+Let us imagine we are trying to do a graph learning task (e.g. node classification) on a group of cars (not very realistic as I said). That is, we have a group of cars, an adjacency matrix, and some features (e.g. price at the end of the year). We want to predict the node label (i.e. brand of the car).
+I will be using the following adjacency matrix: (apologies, cannot use latex to format this)
+A = [(0, 1, 0, 1, 1), (1, 0, 1, 1, 0), (0, 1, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 1, 0, 0)]
+Here is the code (for Google Colab environment):
+import pandas as pd
+import numpy as np
+import matplotlib.pyplot as plt
+import networkx as nx
+from torch_geometric.utils.convert import to_networkx, from_networkx
+import torch
+
+!pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.10.0+cpu.html
+
+# Make the networkx graph
+G = nx.Graph()
+
+# Add some cars (just do 4 for now)
+G.add_nodes_from([
+ (1, {'Brand': 'Ford'}),
+ (2, {'Brand': 'Audi'}),
+ (3, {'Brand': 'BMW'}),
+ (4, {'Brand': 'Peugot'}),
+ (5, {'Brand': 'Lexus'}),
+])
+
+# Add some edges
+G.add_edges_from([
+ (1, 2), (1, 4), (1, 5),
+ (2, 3), (2, 4),
+ (3, 2), (3, 5),
+ (4, 1), (4, 2),
+ (5, 1), (5, 3)
+])
+
+# Convert the graph into PyTorch geometric
+pyg_graph = from_networkx(G)
+
+So this correctly converts the networkx graph to PyTorch Geometric. However, I still don't know how to properly set the labels.
+The brand values for each node have been converted and are stored within:
+pyg_graph.Brand
+
+Below, I have just made some random numpy arrays of length 5 for each node (just pretend that these are realistic).
+ford_prices = np.random.randint(100, size = 5)
+lexus_prices = np.random.randint(100, size = 5)
+audi_prices = np.random.randint(100, size = 5)
+bmw_prices = np.random.randint(100, size = 5)
+peugot_prices = np.random.randint(100, size = 5)
+
+This brings me to the main question:
+
+- How do I set the prices to be the node features of this graph?
+- How do I set the labels of the nodes? (and will I need to remove the labels from
pyg_graph.Brand when training the network?)
+
+Thanks in advance and happy holidays.
","The easiest way is to add all information to the networkx graph and directly create it in the way you need it. I guess you want to use some Graph Neural Networks. Then you want to have something like below.
+
+- Instead of text as labels, you probably want to have a categorical representation, e.g. 1 stands for Ford.
+- If you want to match the "usual convention", then name your input features
x and your labels/ground truth y.
+- The splitting of the data into train and test sets is done via masks. So the graph still contains all the information, but only part of it is used for training. Check the
PyTorch Geometric introduction for an example, which uses the Cora dataset.
+
+import networkx as nx
+import numpy as np
+import torch
+from torch_geometric.utils.convert import from_networkx
+
+
+# Make the networkx graph
+G = nx.Graph()
+
+# Add some cars (5 nodes, one per brand)
+G.add_nodes_from([
+ (1, {'y': 1, 'x': 0.5}),
+ (2, {'y': 2, 'x': 0.2}),
+ (3, {'y': 3, 'x': 0.3}),
+ (4, {'y': 4, 'x': 0.1}),
+ (5, {'y': 5, 'x': 0.2}),
+])
+
+# Add some edges
+G.add_edges_from([
+ (1, 2), (1, 4), (1, 5),
+ (2, 3), (2, 4),
+ (3, 2), (3, 5),
+ (4, 1), (4, 2),
+ (5, 1), (5, 3)
+])
+
+# Convert the graph into PyTorch geometric
+pyg_graph = from_networkx(G)
+
+print(pyg_graph)
+# Data(edge_index=[2, 12], x=[5], y=[5])
+print(pyg_graph.x)
+# tensor([0.5000, 0.2000, 0.3000, 0.1000, 0.2000])
+print(pyg_graph.y)
+# tensor([1, 2, 3, 4, 5])
+print(pyg_graph.edge_index)
+# tensor([[0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4],
+# [1, 3, 4, 0, 2, 3, 1, 4, 0, 1, 0, 2]])
+
+
+# Split the data
+train_ratio = 0.2
+num_nodes = pyg_graph.x.shape[0]
+num_train = int(num_nodes * train_ratio)
+idx = [i for i in range(num_nodes)]
+
+np.random.shuffle(idx)
+train_mask = torch.full_like(pyg_graph.y, False, dtype=bool)
+train_mask[idx[:num_train]] = True
+test_mask = torch.full_like(pyg_graph.y, False, dtype=bool)
+test_mask[idx[num_train:]] = True
+
+print(train_mask)
+# tensor([ True, False, False, False, False])
+print(test_mask)
+# tensor([False, True, True, True, True])
+
",python
+"Best way to remove specific words from column in pandas dataframe?I'm working with a huge set of data that I can't work with in excel so I'm using Pandas/Python, but I'm relatively new to it. I have this column of book titles that also include genres, both before and after the title. I only want the column to contain book titles, so what would be the easiest way to remove the genres?
+Here is an example of what the column contains:
+Book Labels
+Science Fiction | Drama | Dune
+Thriller | Mystery | The Day I Died
+Thriller | Razorblade Tears | Family | Drama
+Comedy | How To Marry Keanu Reeves In 90 Days | Drama
+...
+
+So above, the book titles would be Dune, The Day I Died, Razorblade Tears, and How To Marry Keanu Reeves In 90 Days, but as you can see the genres precede as well as succeed the titles.
+I was thinking I could create a list of all the genres (as there are only so many) and remove those from the column along with the "|" characters, but if anyone has suggestions on a simpler way to remove the genres and "|" key, please help me out.
","This is an enhancement to @tdy's regex solution. The original regex Family|Drama matches the words "Family" and "Drama" anywhere in the string. If a book title contains one of the genre words, that word will be removed as well.
+Supposing that the labels are separated by " | ", there are three cases we want to remove.
+
+- Genre at the start of the string, e.g.
Drama | ...
+- Genre in the middle, e.g.
... | Drama | ...
+- Genre at the end of the string, e.g.
... | Drama
+
+Use the regex (^|\| )(?:Family|Drama)(?=( \||$)) to match any of the three cases. Note that | Drama | Family contains 2 overlapping matches; here I use the lookahead (?=( \||$)) so that both are matched rather than only one. See this problem [Use regular expressions to replace overlapping subpatterns] for more details.
+>>> genres = ["Family", "Drama"]
+
+>>> df
+
+# Book Labels
+# 0 Drama | Drama 123 | Family
+# 1 Drama 123 | Drama | Family
+# 2 Drama | Family | Drama 123
+# 3 123 Drama 123 | Family | Drama
+# 4 Drama | Family | 123 Drama
+
+>>> re_str = r"(^|\| )(?:{})(?=( \||$))".format("|".join(genres))
+
+>>> df['Book Labels'] = df['Book Labels'].str.replace(re_str, "", regex=True)
+
+# 0 | Drama 123
+# 1 Drama 123
+# 2 | Drama 123
+# 3 123 Drama 123
+# 4 | 123 Drama
+
+>>> df["Book Labels"] = df["Book Labels"].str.strip("| ")
+
+# 0 Drama 123
+# 1 Drama 123
+# 2 Drama 123
+# 3 123 Drama 123
+# 4 123 Drama
+
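If you would rather avoid the regex, a different technique (not the approach above) is to split each entry on the separator and drop the parts that are whole genres. A minimal sketch, assuming the separator is always ' | ' and that GENRES, a made-up list here, covers every genre in your data:

```python
# Hypothetical genre list; in practice this would hold every genre in the data.
GENRES = {'Science Fiction', 'Drama', 'Thriller', 'Mystery', 'Family', 'Comedy'}


def strip_genres(labels, genres=GENRES, sep=' | '):
    # Compare whole labels, so a title such as 'Drama 123' is never partially matched.
    parts = [part.strip() for part in labels.split(sep)]
    return sep.join(part for part in parts if part not in genres)


rows = [
    'Science Fiction | Drama | Dune',
    'Thriller | Mystery | The Day I Died',
    'Thriller | Razorblade Tears | Family | Drama',
    'Comedy | How To Marry Keanu Reeves In 90 Days | Drama',
    'Drama | Drama 123 | Family',
]
for row in rows:
    print(strip_genres(row))
```

On a DataFrame this could be applied with df['Book Labels'].apply(strip_genres). A title that is itself exactly equal to a genre name would still be removed, so the genre list needs some care.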
",python
+"RabbitMQ and Celery: subscribe to job done eventI have a simple Celery task.py running with RabbitMQ message broker and Redis data storage
+from celery import Celery
+
+app = Celery('tasks', broker='pyamqp://guest@localhost//', backend="redis://localhost:6379/0")
+
+@app.task
+def add(x, y):
+ return x + y
+
+and a listener.py service, with a trivial function
+def on_add(result):
+ # Do something with the result.
+
+I want to invoke add() with a fire-and-forget style, and let another service implementing on_add() handle results.
+This is the diagram of the workflow:
+![]()
+How can I create a listener that subscribes to task completion events on Celery's backend, Redis?
","You have at least two options here:
+
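+- Use signals, task_postrun for instance:
+
+from celery.signals import task_postrun
+
+@task_postrun.connect
+def task_postrun_handler(task_id, task, args, retval, **kwargs):
+    # Celery names tasks <module>.<function> by default; 'tasks.add' assumes the task lives in tasks.py
+    if task.name == "tasks.add":
+        on_add(retval)
+
+Note that it will run in the same Celery worker.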
+
+- If you need it in a separate process, you can take flower's approach and listen to the broker's events (more complicated).
+
",python
+"Receiving a ""PRECONDITION_FAILED - invalid property 'auto-delete' for queue 'test.queue' in vhost '/'"" error message for PikaI am attempting to declare a Queue on my RabbitMQ broker. I had no issue before this until I tried adding the "auto_delete=True" parameter to the queue declaration. The Queue is meant to consume messages from a fanout Exchange.
+I have already deleted the Queue before declaring it again. I even tried renaming the Queue. Nevertheless, I keep receiving the same error "PRECONDITION_FAILED - invalid property 'auto-delete' for queue 'test.queue' in vhost '/'". Below is my code for declaring the queue:
+def setup_queue(self, queue_name):
+ LOGGER.info('Declaring queue %s', queue_name)
+ cb = functools.partial(self.on_queue_declareok, userdata=queue_name)
+ self._channel.queue_declare(
+ queue=queue_name,
+ callback=cb,
+ durable=True,
+ auto_delete=True,
+ arguments={"x-queue-type": "quorum", "x-max-length": MAX_QUEUE_LENGTH})
+
","You can't set auto_delete=True for quorum queues.
+See the feature matrix for more details:
+https://www.rabbitmq.com/quorum-queues.html
",python
+"A way to blit multiple buttons from a list with working alpha and color?I wanted to blit several button choices depending on the number that a list gives, but putting them in a for loop makes the alpha() and fill() function stop working. Is there a way to fix this or there's a better alternative to code multiple buttons?
+Starting code:
+import pygame, sys
+pygame.init()
+
+WIDTH, HEIGHT = 900, 600
+screen = pygame.display.set_mode((WIDTH,HEIGHT),0,32)
+clock = pygame.time.Clock()
+font = pygame.font.Font('freesansbold.ttf', 32)
+
+Button and Scene class:
+class Button():
+ def __init__(self, text, x, y):
+ self.rect = pygame.Rect(x, y, 0, 0)
+ self.updateText(text)
+ self.clicked = False
+
+ def updateText(self, text):
+ self.text = text
+ self.render = font.render(self.text, True, 'white')
+ self.text_width = self.render.get_width()
+ self.text_height = self.render.get_height()
+ self.box = pygame.Surface((self.text_width, self.text_height))
+ self.rect = self.render.get_rect(topleft = self.rect.topleft)
+
+ def draw(self):
+ action = False
+ screen.blit(self.box, (self.rect.x, self.rect.y))
+ screen.blit(self.render, (self.rect.x, self.rect.y))
+
+ pos = pygame.mouse.get_pos()
+ if self.rect.collidepoint(pos):
+ self.box.set_alpha(100)
+ self.box.fill((255, 255, 255))
+
+ if pygame.mouse.get_pressed()[0] == 1 and self.clicked == False:
+ self.clicked = True
+ action = True
+
+ if pygame.mouse.get_pressed()[0] == 0:
+ action = False
+ self.clicked = False
+ else:
+ self.box.set_alpha(0)
+
+ return action
+
+
+class Scene():
+ def __init__(self):
+ pass
+
+ def on_start(self):
+ self.count = 0
+ self.blitcount = 0
+ self.optionList = []
+
+ for button in range(5):
+ self.optionList.append(Button("button", WIDTH/3*2, 60 *(button + 1)))
+ self.count += 1
+
+ self.altButton = Button("Button without for loop", 100, 100)
+ self.buttons = None
+
+ def update(self, events):
+ screen.fill('gray')
+
+ for i in range(4):
+
+ self.buttons = self.optionList[self.blitcount]
+ self.buttons.updateText(str(i))
+
+ if self.buttons.draw():
+ print(i)
+
+ self.blitcount += 1
+
+ if self.altButton.draw():
+ print("Alt")
+
+ self.blitcount = 0
+
+ return self
+
+The rest of the code:
+game = Scene()
+game.on_start()
+
+while True:
+ clock.tick(30)
+
+ events = pygame.event.get()
+ for event in events:
+ if event.type == pygame.QUIT:
+ pygame.quit()
+ sys.exit()
+
+ game.update(events)
+
+ pygame.display.update()
+
","You have to change the order. You need to set the alpha channel of the box before drawing the box. Note that the box is drawn with the currently set alpha channel.
+(See also the answer to your previous question How do I change the text in a Button class for Pygame?)
+class Button():
+ # [...]
+
+ def draw(self):
+ action = False
+
+ pos = pygame.mouse.get_pos()
+ if self.rect.collidepoint(pos):
+ self.box.set_alpha(100)
+ self.box.fill((255, 255, 255))
+
+ if pygame.mouse.get_pressed()[0] == 1 and self.clicked == False:
+ self.clicked = True
+ action = True
+
+ if pygame.mouse.get_pressed()[0] == 0:
+ action = False
+ self.clicked = False
+ else:
+ self.box.set_alpha(0)
+
+ screen.blit(self.box, (self.rect.x, self.rect.y))
+ screen.blit(self.render, (self.rect.x, self.rect.y))
+
+ return action
+
",python
+"How can I pick specific ranged value from np array?My code is this.
+a = cv2.imread('img_directory',0)/255
+a0 = np.zeros_like(a)
+a0[a<0.9] = 1.0
+a0[a>0.5] = 0.0
+
+Here, I want to take values of a that is larger than 0.5, and smaller than 0.9.
+But I see that this code does not work properly.
+First I tried
+a=[a<0.9 and a>0.5] =1.0
+
+and this did not work. How should I make a code to fulfill my task ?
","If you want the element-wise logical AND of two NumPy arrays, use np.logical_and(); you can refer to the documentation for more information.
+Therefore, a0[np.logical_and(a < 0.9, a > 0.5)] = 1.0 will work. The equivalent operator form is a0[(a < 0.9) & (a > 0.5)] = 1.0.
",python
+"How can I put labels in two charts using matplotlibI'm trying to plot two histogram using the result of a group by. But the labels just appear in one of the labels.
+How can I put the label in both charts?
+And how can I put different title for the charts (e.g. first as Men's grade and Second as Woman's grade)
+import pandas as pd
+import matplotlib.pyplot as plt
+
+microdataEnem = pd.read_csv('C:\\Users\\Lucas\\AppData\\Local\\Programs\\Python\\Python39\\Scripts\\Data Science\\Data Analysis\\Projects\\ENEM\\DADOS\\MICRODADOS_ENEM_2019.csv', sep = ';', encoding = 'ISO-8859-1', nrows=10000)
+
+sex_essaygrade = ['TP_SEXO', 'NU_NOTA_REDACAO']
+
+filter_sex_essaygrade = microdataEnem.filter(items = sex_essaygrade)
+
+filter_sex_essaygrade.dropna(subset = ['NU_NOTA_REDACAO'], inplace = True)
+
+filter_sex_essaygrade.groupby('TP_SEXO').hist()
+plt.xlabel('Grade')
+plt.ylabel('Number of students')
+
+
+plt.show()
+
+![]()
","Instead of using filter_sex_essaygrade.groupby('TP_SEXO').hist(), you can try the following: axs = filter_sex_essaygrade['NU_NOTA_REDACAO'].hist(by=filter_sex_essaygrade['TP_SEXO']). This will automatically title each histogram with the group name.
+Assign the result to a variable such as axs so that you can set the x and y labels on both plots.
+I created some data similar to yours, and I get the following result:
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+
+np.random.seed(42)
+
+sex_essaygrade = ['TP_SEXO', 'NU_NOTA_REDACAO']
+
+## create two distinct sets of grades
+sample_grades = np.concatenate((np.random.randint(low=70,high=100,size=100), np.random.randint(low=80,high=100,size=100)))
+
+filter_sex_essaygrade = pd.DataFrame({
+    'NU_NOTA_REDACAO': sample_grades,
+    'TP_SEXO': ['Men']*100 + ['Women']*100
+})
+
+axs = filter_sex_essaygrade['NU_NOTA_REDACAO'].hist(by=filter_sex_essaygrade['TP_SEXO'])
+for ax in axs.flatten():
+    ax.set_xlabel("Grade")
+    ax.set_ylabel("Number of students")
+
+plt.show()
+
+![]()
",python
+"Not able to connect to AWS Aurora DB server-less -- mysql engineI have created a publicly accessible AWS Aurora Serverless DB, and I am trying to connect to it using Python. But I am unable to connect, and I suspect the VPC.
+Please suggest anything else I should check. I also have the questions below:
+
+- Is AWS Aurora Serverless with the minimum configuration a free tier DB?
+- My VPC was already public when I created the DB, yet I am not able to connect. Do I need to make any additional configuration changes?
+
+Code Snippets:
+import mysql.connector as mysq
+import sys, os, boto3 as aws, pandas as pd, pymysql
+from sqlalchemy import create_engine, inspect
+
+ENDPOINT = "random.cluster-random.ap-south-1.rds.amazonaws.com"
+PORT = "3306"
+USER = "random"
+REGION = "ap-south-1"
+DBNAME = "random"
+PASSWORD = "random"
+
+
+Method 1:
+client = aws.client('rds')
+token = client.generate_db_auth_token(DBHostname=ENDPOINT, Port=PORT, DBUsername=USER, Region=REGION)
+print(token)
+
+try:
+    conn = mysq.connect(host=ENDPOINT, user=USER, passwd=token, port=PORT, database=DBNAME)
+
+    cur = conn.cursor()
+    cur.execute("""SELECT now()""")
+    query_results = cur.fetchall()
+    print(query_results)
+except Exception as e:
+    print("Database connection failed due to {}".format(e))
+
+Method 2:
+CONNECTION_STRING = 'mssql+pymssql' + '://' + USER + ':' + PASSWORD + '@' + ENDPOINT + ':' + str(PORT) + '/' + DBNAME
+engine = create_engine(CONNECTION_STRING)
+print(inspect(engine).get_table_names())
+
+Method 3:
+conn = pymysql.connect(host=ENDPOINT, user=USER,port=int(PORT), passwd=PASSWORD, db=DBNAME)
+
+Thanks,
+Nikhil
","Aurora Serverless does not have a public IP. From the docs:
+
+You can't give an Aurora Serverless v1 DB cluster a public IP address. You can access an Aurora Serverless v1 DB cluster only from within a VPC.
+
+Same for Aurora v2.
+So you have to set up a VPN between your home/work network and your VPC, or use SSH tunneling through an EC2 instance acting as a bastion host.
",python
+"Script fails to produce descriptions from a webpage when hardcoded delay is not in placeI'm trying to fetch titles and descriptions of results from a webpage. The descriptions are revealed when the titles are clicked. The script below works only when I define a hardcoded delay after the click. However, I wish to get rid of the hardcoded delay from the script.
+import time
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+
+link = 'https://innovation.ised-isde.canada.ca/s/list-liste?language=en_CA&token=a0B5W000000WsFSUA0'
+
+driver = webdriver.Chrome()
+wait = WebDriverWait(driver,10)
+driver.get(link)
+
+for i,item in enumerate(wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,".advanced-results .h4")))):
+    driver.execute_script("arguments[0].click();",item)
+    title = item.text
+    time.sleep(2)
+    desc = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"p.program-dov-description")))[i].text
+    print(title,desc)
+    print()
+
+driver.quit()
+
+
+How can I fetch titles and descriptions from that webpage without using a hardcoded delay?
+
","This should work. I couldn't figure out how to pick the matching element out of the multiple results with a CSS selector, so I fell back on my old friend XPath and fetched the result without time.sleep, as required. With the line of yours below I get an error stating that the element is not subscriptable because of [i], but my XPath works.
+desc = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"p.program-dov-description")))[i].text
+ print(title,desc)
+
+Here is the refactored code:
+import time
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+from selenium.webdriver.support import expected_conditions as EC
+from webdriver_manager.chrome import ChromeDriverManager
+
+link = 'https://innovation.ised-isde.canada.ca/s/list-liste?language=en_CA&token=a0B5W000000WsFSUA0'
+
+driver = webdriver.Chrome(ChromeDriverManager().install())
+wait = WebDriverWait(driver,10)
+driver.get(link)
+
+for i,item in enumerate(wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,".advanced-results .h4")))):
+    # print(i, item)
+    driver.execute_script("arguments[0].click();",item)
+    title = item.text
+    desc_text = "//*[@class='program-dov-description']"
+    desc = wait.until(EC.visibility_of_element_located((By.XPATH,'(' + desc_text + ')' + '[' + str(i+1) + ']' ))).text
+    print(title,desc)
+    print()
+
+driver.quit()
+
+Output (pasting partial output to save space)
+Income support for sick or self-isolating workers Get up to $500 per week for six weeks if you are employed or self-employed and need to take time off because you’re sick or need to self-isolate due to COVID-19 or have an underlying medical condition that puts you at higher risks of getting COVID-19.
+
+Income support to care for dependent family members Get up to $500 per week for up to 44 weeks, per household, if you haven’t been able to work for at least 50% of your normal work week because you are caring for children under 12 years old or other dependent family members because of COVID-19.
+
+Support for businesses to avoid layoffs If your employees are eligible for Employment Insurance (EI) benefits, and your business is suffering a downturn due to COVID-19, you can apply for a work-sharing agreement. The agreement would allow your employees to work a temporarily reduced work week while receiving EI benefits.
+
",python
+"How should I filter one dataframe by entries from another one in pandas with isin?I have two dataframes (df1, df2). The column names and indices are the same (the difference is in the column entries). Also, df2 has only 20 entries (which also exist in df1, as I said).
+I want to filter df1 by the df2 entries, but when I try to do it with isin, nothing happens.
+df1.isin(df2) or df1.index.isin(df2.index)
+
+Tell me please what I'm doing wrong and how should I do it..
","If you want to select the entries in df1 with an index that is also present in df2, you should be able to do it with:
+df1.loc[df2.index]
+
+or if you really want to use isin:
+df1[df1.index.isin(df2.index)]
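+
+A minimal sketch of why isin on its own seems to do nothing: it only returns a boolean mask, and df1 changes only if you index with that mask and keep the result. The two small frames below are made up for illustration. Note that df1.loc[df2.index] raises a KeyError if any label is missing from df1, while the isin mask simply skips it.
+
```python
import pandas as pd

# Hypothetical frames: df2 holds a subset of df1's index labels.
df1 = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])
df2 = pd.DataFrame({"value": [21, 41]}, index=["b", "d"])

# Boolean mask: True where df1's index label also appears in df2's index.
mask = df1.index.isin(df2.index)
filtered = df1[mask]

# Label-based lookup selects the same rows when every label exists in df1.
by_label = df1.loc[df2.index]

print(filtered)
```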
+
",python
+"how do I manage deletion with multiple foreign keys to the same table in Django?Say I have a model with a Things table and a table of relationships between the things called ThingRelations. It should not be possible to delete a Thing when there are ThingRelations that point to it. This is how I'm trying to implement that:
+from django.db import models
+
+class ThingRelation(models.Model):
+    first_thing = models.ForeignKey('Thing', on_delete=models.PROTECT)
+    second_thing = models.ForeignKey('Thing', on_delete=models.PROTECT)
+
+class Thing(models.Model):
+    name = models.CharField(max_length=260)
+
+
+How do I automatically delete a Thing when there are no more ThingRelations pointing to it?
","You have these options:
+
+- A routine. It can be implemented as a management Command run by something like crontab, or designed as a periodiq routine. Either way, you repeatedly select all Thing rows that have no ThingRelation pointing to them.
+- A signals.py action. When a ThingRelation entry is deleted, check both first_thing and second_thing to see whether any other ThingRelation still points to them.
+- A DB trigger (e.g. for PostgreSQL). The same idea as the signals.py solution, but at the DB level.
+
+Which one should you choose? It depends on the details of your exact objective. Personally, I use the periodiq option for simple cases and a DB trigger when I am aiming for high performance.
",python
+"Separate folder for django modelsI am working on a Django project that will use different apps to fulfill certain tasks. Since these apps will refer to much of the same data to complete those tasks, I figure it makes sense to create a separate folder with the models, like this:
+--Project
+ --App1
+ --App2
+ --models
+ ---model1.py
+ ---model2.py
+
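+Right now I'm having trouble getting Django to recognize the models: every time I run makemigrations, Django does not detect that any changes have been made.
+I tried putting an __init__.py file in the /models folder, but this doesn't seem to do anything.
","Django only discovers models that belong to an installed app, so a top-level models folder that sits outside every app is not picked up by makemigrations. That's why you cannot migrate. Keep the models inside an app, either in its models.py or in a models/ package whose __init__.py imports each model class, and import them from the other apps.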
",python
+"TypeError: 'str' object is not callable matplotlibI encountered this error: TypeError: 'str' object is not callable when trying to add a label on the Y axis of my bar chart. Why? Please help. Here's the code:
+import pandas as pd
+import matplotlib.pyplot as plt
+# TODO: Plot company with the most stock and the lowest stock
+data = pd.read_csv("c:\\users\\HP\\Downloads\\Stock.csv")
+
+plt.bar(1,data['Stock'].max(), label=data['Company'].max())
+plt.bar(1,data['Stock'].min(), label=data['Company'].min())
+plt.ylabel("Stock")
+plt.legend()
+
+Full error:
+ TypeError Traceback (most recent call last)
+<ipython-input-32-136cc5c2aaf2> in <module>
+ 5 plt.bar(1,data['Stock'].max(), label=data['Company'].max())
+ 6 plt.bar(1,data['Stock'].min(), label=data['Company'].min())
+----> 7 plt.ylabel("Stock")
+ 8 plt.legend()
+
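+TypeError: 'str' object is not callable
+
",Try restarting the kernel. This error usually appears when an earlier cell assigned a string to plt.ylabel (for example plt.ylabel = 'Stock') instead of calling it; restarting restores the original function.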
,python
+"operands could not be broadcast together with shapes (100,3) (100,) , why?This is my first question on Stack Overflow, and my English is really poor, so I'm grateful to everyone who reads it and helps me ^_^
+My question is about broadcasting.
+![]()
+What I want to do is multiply each row of X by the number in the same row of B.
+X is a (100,3) array and XW is a column vector, (100,). Why can't they broadcast?
+After I add "XW = XW.reshape((X.shape[0],1))", they can broadcast. Why? Is there any difference between (100,1) and (100,)?
+I think my picture clearly describes my question. My code is really long, so it may not be convenient to read.
+Here is the code:
+import numpy as np
+import matplotlib.pyplot as plt
+
+class MyFirstMachineLeaningAlgorithm():
+    def StochasticGradientDescent(self, W, X, count=100, a=0.1):
+
+        n = X.shape[0]
+        for i in range(count):  # learn count times
+            gradient = np.zeros(3)
+            for j in range(n):
+                gradient += X[j, :] * (1 - 2 * (X[j, :] @ W))
+
+            W = W + a * gradient
+            # renormalize W to unit length
+            W = W / np.sqrt((W @ W))
+
+        return W
+
+    def BatchGraidentDescent(self, W, X, count=100, a=0.1):
+        for i in range(count):
+            XW = X @ W
+            XW = 1 - 2 * XW
+
+            #XW = XW.reshape((X.shape[0],1))
+            gradient = X*XW
+            gradient = np.sum(gradient,axis = 0)
+
+            W = W + a * gradient
+            # renormalize W to unit length
+            W = W / np.sqrt((W @ W))
+
+    def train(self, count=100):
+        self.W = self.BatchGraidentDescent(self.W, self.X, count)
+
+    def draw(self):
+        draw_x = np.arange(-120, 120, 0.01)
+        draw_y = -self.W[0] / self.W[1] * draw_x
+        draw_y = [-self.W[2] / self.W[1] + draw_y[i] for i in range(len(draw_y))]
+        plt.plot(draw_x, draw_y)
+        plt.show()
+
+    def __init__(self):
+        array_size = (50, 2)
+        array1 = np.random.randint(50, 100, size=array_size)
+        array2 = np.random.randint(-100, -50, size=array_size)
+        array = np.vstack((array1, array2))
+        column = np.ones(100)
+        self.X = np.column_stack((array, column))
+        plt.scatter(array[:, 0], array[0:, 1])
+        self.W = np.array([1, 2, 3])
+        self.W = self.W / np.sqrt((self.W @ self.W))
+
+g = MyFirstMachineLeaningAlgorithm()
+g.train()
+g.draw()
+
+
","It's best to post error information with copy-n-paste, not an image. Still the image is better than nothing.
+So the error occurs in the last line of this clip:
+ XW = X @ W
+ XW = 1 - 2 * XW
+
+ #XW = XW.reshape((X.shape[0],1))
+ gradient = X*XW
+
+Just from the function definition I can't tell the shape of X and W. Apparently X is 2d (100,n). If W is (n,), then XW will be (100,), with the sum-of-products on the n dimension. Read the np.matmul docs if that isn't clear.
+By the rules of broadcasting (look them up), if one array doesn't have as many dimensions as the other, it will add leading dimensions as needed. Thus (100,) can become (1,100). But to avoid ambiguity, it will not add a trailing dimension. You have to provide that yourself. So the last line should become
+ gradient = X * XW[:,None]
+
+or the equivalent using XW.reshape(-1,1) or your version.
+Because arrays can be 1d (or even 0d), terms like row vector or column vector have limited value. A 1d array can be thought of as a row vector in some cases, namely where this automatic leading dimension applies.
+
+In init,
+ self.X = np.column_stack((array, column))
+ self.W = np.array([1, 2, 3])
+
+X is (100,3) and W is (3,). X@W is then (100,).
+In [45]: X=np.ones((100,3)); W=np.array([1,2,3])
+In [46]: (X@W).shape
+Out[46]: (100,)
+In [47]: X * (1+(X@W)[:,None]);
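+
+Putting the pieces together, here is a small self-contained sketch of the failing and the working multiplication. It assumes an all-ones X purely for illustration (the question's X is random), so the numbers mean nothing beyond showing the shapes.
+
```python
import numpy as np

X = np.ones((100, 3))
W = np.array([1.0, 2.0, 3.0])
XW = 1 - 2 * (X @ W)  # shape (100,)

# (100, 3) * (100,) fails: the trailing dimensions 3 and 100 do not match.
try:
    X * XW
    failed = False
except ValueError:
    failed = True

# A trailing axis gives (100, 1), which broadcasts across the 3 columns.
gradient = (X * XW[:, None]).sum(axis=0)

print(failed, XW.shape, XW[:, None].shape, gradient.shape)
```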
+
",python
+"change line color in middle of plot - plotlyCreating 12 subplots with plotly from a csv containing rows with 13 values. The last value indicates that the data in that row is estimated from this point until the status changes back in a later row.
+Trying to make line graphs that plots a line, changes to red when the status changes to 1, then back to original color when the status changes back to 0. Is this possible?
+with lock:
+    df = pd.read_csv(OCcsvFile, delimiter=',')
+
+# plotly setup
+plot_rows = 4
+plot_cols = 3
+# Create plot figure
+fig = make_subplots(rows=plot_rows, cols=plot_cols, subplot_titles=("Header1", "Header2", "Header3", "Header4", "Header5",
+                    "Header6", "Header7", "Header8", "Header9", "Header10",
+                    "Header11", "Header12"))
+
+# add traces
+x = 1 # column counter
+for i in range(1, plot_rows+1):
+    for j in range(1, plot_cols+1):
+        #print(str(i)+ ', ' + str(j))
+        fig.add_trace(go.Scatter(x=df.iloc[:, 0], y=df.iloc[:, x],
+                                 name=df.columns[x],
+                                 mode='lines'),
+                      row=i,
+                      col=j)
+        x = x+1
+
","
+- The approach taken here is to reshape the dataframe so it is ready for Plotly Express
+- The starting dataframe was worked out from the description; a sample would be better
+
+import numpy as np
+import pandas as pd
+import plotly.express as px
+
+df = pd.DataFrame(
+    np.random.randint(1, 10, [100, 12]), columns=[f"c{i+1}" for i in range(12)]
+).assign(status=np.repeat(np.random.randint(0, 2, 20), 5))
+
+# restructure dataframe for px
+# 1. preserve status in index
+# 2. make columns another level of index
+# 3. make index columns and make column names meaningful
+dfp = (
+    df.set_index("status", append=True)
+    .stack()
+    .reset_index()
+    .rename(columns={"level_0": "x", "level_2": "facet", 0: "value"})
+)
+
+# make sure missing values are present as NaN
+dfp = dfp.merge(
+    pd.DataFrame(index=pd.MultiIndex.from_product(
+        [dfp["x"].unique(), dfp["facet"].unique(), dfp["status"].unique()]
+    )),
+    left_on=["x", "facet", "status"],
+    right_index=True,
+    how="right"
+)
+
+# now it's a very simple plot
+px.line(dfp, x="x", y="value", color="status", facet_col="facet", facet_col_wrap=4)
+
+
+
+![]()
+expected structure of df
+
+- 13 columns, last column indicating the status
+
+|    | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9 | c10 | c11 | c12 | status |
+|----|----|----|----|----|----|----|----|----|----|-----|-----|-----|--------|
+| 10 | 7 | 3 | 1 | 7 | 2 | 8 | 1 | 3 | 9 | 6 | 3 | 8 | 1 |
+| 11 | 4 | 5 | 8 | 9 | 5 | 4 | 3 | 6 | 3 | 7 | 4 | 8 | 1 |
+| 12 | 6 | 3 | 2 | 6 | 5 | 6 | 4 | 3 | 5 | 3 | 9 | 7 | 1 |
+| 13 | 4 | 2 | 4 | 8 | 6 | 3 | 3 | 5 | 8 | 8 | 1 | 4 | 1 |
+| 14 | 4 | 9 | 9 | 3 | 1 | 8 | 2 | 5 | 1 | 5 | 1 | 4 | 1 |
+| 15 | 4 | 9 | 6 | 2 | 9 | 4 | 1 | 6 | 6 | 1 | 6 | 1 | 0 |
+| 16 | 8 | 5 | 9 | 7 | 7 | 3 | 1 | 1 | 2 | 5 | 2 | 9 | 0 |
+| 17 | 6 | 1 | 4 | 2 | 8 | 5 | 9 | 8 | 2 | 4 | 8 | 4 | 0 |
+| 18 | 1 | 6 | 1 | 3 | 8 | 5 | 5 | 9 | 8 | 9 | 2 | 9 | 0 |
+| 19 | 1 | 4 | 1 | 1 | 7 | 8 | 2 | 3 | 5 | 6 | 6 | 4 | 0 |
+
",python
+"Discord.py How do I split a bots message?So, I have a discord information bot I am working on and this is one of the commands.
+@client.command()
+async def rd(ctx):
+ embed = discord.Embed(title="**[R&D Cost at every level]:**")
+ embed.add_field(name="**Level//All Units ATK Bonus//March Size Increase//R&D Cost**", value= "|", inline=False)
+ embed.add_field(name="6 0.6% 0 10", value='.', inline=False)
+ embed.add_field(name="7 0.7% 0 20", value='.', inline=False)
+ embed.add_field(name="8 0.8% 0 30", value='.', inline=False)
+ embed.add_field(name="9 0.9% 0 35", value='.', inline=False)
+ embed.add_field(name="10 1.0% 0 40", value='.', inline=False)
+ embed.add_field(name="11 1.2% 0 45", value='.', inline=False)
+ embed.add_field(name="12 1.5% 0 50", value='.', inline=False)
+ embed.add_field(name="13 2.0% 0 60", value='.', inline=False)
+ embed.add_field(name="14 2.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="15 3.0% 0 TBD", value='.', inline=False)
+ embed.add_field(name="16 3.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="17 4.0% 0 TBD", value='.', inline=False)
+ embed.add_field(name="18 4.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="19 5.0% 0 TBD", value='.', inline=False)
+ embed.add_field(name="20 5.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="21 6.0% 0 TBD", value='.', inline=False)
+ embed.add_field(name="22 6.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="23 7.0% 0 TBD", value='.', inline=False)
+ embed.add_field(name="24 7.5% 0 TBD", value='.', inline=False)
+ embed.add_field(name="25 8.0% 1 TBD", value='.', inline=False)
+ embed.add_field(name="26 8.5% 1 TBD", value='.', inline=False)
+ embed.add_field(name="27 9.0% 1 TBD", value='.', inline=False)
+ embed.add_field(name="28 9.5% 1 TBD", value='.', inline=False)
+ embed.add_field(name="29 10.0% 1 TBD", value='.', inline=False)
+ embed.add_field(name="30 10.5% 2 TBD", value='.', inline=False)
+ embed.add_field(name="31 11.0% 2 TBD", value='.', inline=False)
+ embed.add_field(name="32 11.5% 2 TBD", value='.', inline=False)
+ embed.add_field(name="33 12.0% 2 1.27K", value='.', inline=False)
+ embed.add_field(name="34 12.5% 2 1.46K", value='.', inline=False)
+ embed.add_field(name="35 13.0% 3 1.87K", value='.', inline=False)
+ embed.add_field(name="36 13.5% 3 2.02K", value='.', inline=False)
+ embed.add_field(name="37 14.0% 3 2.18K", value='.', inline=False)
+ embed.add_field(name="38 14.5% 3 2.36K", value='.', inline=False)
+ embed.add_field(name="39 15.0% 3 2.54K", value='.', inline=False)
+ embed.add_field(name="40 15.5% 4 2.73K", value='.', inline=False)
+ embed.add_field(name="41 16.0% 4 2.87K", value='.', inline=False)
+ embed.add_field(name="42 16.5% 4 3.07K", value='.', inline=False)
+ embed.add_field(name="43 17.0% 4 3.29K", value='.', inline=False)
+ embed.add_field(name="44 17.5% 4 3.51K", value='.', inline=False)
+ embed.add_field(name="45 18.0% 5 3.74K", value='.', inline=False)
+ embed.add_field(name="46 18.5% 5 3.99K", value='.', inline=False)
+ embed.add_field(name="47 19.0% 5 4.09K", value='.', inline=False)
+ embed.add_field(name="48 19.5% 5 4.34K", value='.', inline=False)
+ embed.add_field(name="49 20.0% 5 4.44K", value='.', inline=False)
+ embed.add_field(name="50 20.5% 6 4.54K", value='.', inline=False)
+
+The command works great, except that discord cuts the message off on "embed.add_field(name="29 10.0% 1 TBD", value='.', inline=False)", due to the max character limit in a single message.
+![]()
+How would I go about splitting this message into pages? I have seen where you can have emojis that scroll through information, but I am unsure on how to apply that to this command here.
+Or how could I make it to where it posts into multiple messages instead of it attempting to post into one whole message?
+Any help is much appreciated!
","This is the code I used to work with pagination before I started implementing buttons
+async def paginate(
+ ctx: discord.ext.commands.context.Context,
+ *embed_pages: typing.Union[discord.Embed, list[discord.Embed]],
+ content=None,
+ overwrite_footer=True,
+ timeout=None,
+):
+
+ if isinstance(embed_pages[0], list):
+ embed_pages = embed_pages[0]
+
+ every_embed = list()
+ if overwrite_footer:
+ for index, each_embed in enumerate(embed_pages):
+ each_embed.remove_footer()
+ each_embed.set_footer(text=f"Page {index + 1} of {len(embed_pages)}")
+ every_embed.append(each_embed)
+ else:
+ every_embed = embed_pages[:]
+
+ sent_embed = await ctx.send(
+ content=content, embed=every_embed[0]
+ ) # Send the first page
+ page_index = 0 # Set the starting index
+
+ reactions = ["⬅️", "", "➡️"]
+ for each_reaction in reactions:
+ await sent_embed.add_reaction(each_reaction)
+
+ while True:
+ try:
+ payload = await bot.wait_for(
+ "raw_reaction_add",
+ check=lambda payload: payload.message_id == sent_embed.id
+ and not payload.member == bot.user
+ and payload.member.id
+ == ctx.author.id, # Do this if you want it to be author-only
+ timeout=timeout,
+ )
+ except asyncio.TimeoutError:
+ # timeout has been hit
+ if overwrite_footer:
+ every_embed[page_index].remove_footer()
+ every_embed[page_index].set_footer(
+ text="Pagination traversal has timed out."
+ )
+ await sent_embed.edit(embed=every_embed[page_index])
+ try:
+ await sent_embed.clear_reactions()
+ except:
+ pass
+ return
+ else:
+ try:
+ await sent_embed.remove_reaction(payload.emoji, payload.member)
+ except discord.Forbidden:
+ # Bot does not have permission
+ pass
+
+ if (
+ str(payload.emoji.name) not in reactions
+ ): # Some user reacted with something else
+ pass
+
+ elif str(payload.emoji.name) == reactions[0]: # Previous page
+ if not page_index == 0:
+ page_index -= 1
+ else:
+ page_index = len(every_embed) - 1
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+ elif str(payload.emoji.name) == reactions[1]: # Goto page 0
+ page_index = 0
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+ else:
+ if not page_index == (len(every_embed) - 1):
+ page_index += 1
+ else:
+ page_index = 0
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+
+This function is a coroutine and has to be awaited. The first argument must be ctx (context); the remaining positional arguments may be discord.Embed objects or a single list containing discord.Embed instances.
+The keyword arguments are optional and self-explanatory.
+Here's a use case:
+import discord
+import os
+import asyncio
+import typing
+import Lorem # Custom module
+from discord.ext import commands
+from dotenv import load_dotenv
+
+load_dotenv()
+bot = commands.Bot(command_prefix=">")
+
+
+async def paginate(
+ ctx: discord.ext.commands.context.Context,
+ *embed_pages: typing.Union[discord.Embed, list[discord.Embed]],
+ content=None,
+ overwrite_footer=True,
+ timeout=None,
+):
+
+ if isinstance(embed_pages[0], list):
+ embed_pages = embed_pages[0]
+
+ every_embed = list()
+ if overwrite_footer:
+ for index, each_embed in enumerate(embed_pages):
+ each_embed.remove_footer()
+ each_embed.set_footer(text=f"Page {index + 1} of {len(embed_pages)}")
+ every_embed.append(each_embed)
+ else:
+ every_embed = embed_pages[:]
+
+ sent_embed = await ctx.send(
+ content=content, embed=every_embed[0]
+ ) # Send the first page
+ page_index = 0 # Set the starting index
+
+ reactions = ["⬅️", "", "➡️"]
+ for each_reaction in reactions:
+ await sent_embed.add_reaction(each_reaction)
+
+ while True:
+ try:
+ payload = await bot.wait_for(
+ "raw_reaction_add",
+ check=lambda payload: payload.message_id == sent_embed.id
+ and not payload.member == bot.user
+ and payload.member.id
+ == ctx.author.id, # Do this if you want it to be author-only
+ timeout=timeout,
+ )
+ except asyncio.TimeoutError:
+ # timeout has been hit
+ if overwrite_footer:
+ every_embed[page_index].remove_footer()
+ every_embed[page_index].set_footer(
+ text="Pagination traversal has timed out."
+ )
+ await sent_embed.edit(embed=every_embed[page_index])
+ try:
+ await sent_embed.clear_reactions()
+ except:
+ pass
+ return
+ else:
+ try:
+ await sent_embed.remove_reaction(payload.emoji, payload.member)
+ except discord.Forbidden:
+ # Bot does not have permission
+ pass
+
+ if (
+ str(payload.emoji.name) not in reactions
+ ): # Some user reacted with something else
+ pass
+
+ elif str(payload.emoji.name) == reactions[0]: # Previous page
+ if not page_index == 0:
+ page_index -= 1
+ else:
+ page_index = len(every_embed) - 1
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+ elif str(payload.emoji.name) == reactions[1]: # Goto page 0
+ page_index = 0
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+ else:
+ if not page_index == (len(every_embed) - 1):
+ page_index += 1
+ else:
+ page_index = 0
+ await sent_embed.edit(embed=every_embed[page_index], content=content)
+
+
+@bot.event
+async def on_ready():
+ print(f"Sucessfully logged in as {bot.user}")
+
+
+@bot.command()
+async def start(ctx):
+
+ await paginate(
+ ctx,
+ [
+ discord.Embed(title="Answered by Achxy!", description=Lorem.lorem(6))
+ for _ in range(100)
+ ],
+ )
+ # We just generated 100 pages of Lorem Ipsum :D
+
+
+@bot.command()
+async def ping(ctx):
+ # I made this command just to prove that the while loop earlier isn't blocking.
+ embed = discord.Embed(
+ title="Pong! ",
+ description=f"Current Latency of the bot is {round(bot.latency * 1000)}ms",
+ )
+ await ctx.reply(embed=embed)
+
+
+bot.run(os.getenv("DISCORD_TOKEN"))
+
+This code will bring the following paginated output :
+https://imgur.com/a/tmwGhpc
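+
+For the simpler variant asked about in the question (several messages instead of reaction paging), here is a minimal sketch. The cut-off after level 29 is the 25th field of the embed (the header plus levels 6 to 29), which matches Discord's limit of 25 fields per embed, so the rows can be split into chunks of that size with one embed per chunk. The helper name chunk_rows and the sample rows are illustrative, and the discord.py calls appear only as comments.
+
```python
from typing import List, Sequence

MAX_FIELDS_PER_EMBED = 25  # Discord's per-embed field limit


def chunk_rows(rows: Sequence[str], size: int = MAX_FIELDS_PER_EMBED) -> List[List[str]]:
    """Split rows into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(rows[i:i + size]) for i in range(0, len(rows), size)]


# Illustrative rows standing in for the 45 level lines (levels 6 to 50).
rows = [f"level {level}" for level in range(6, 51)]
pages = chunk_rows(rows)

# In the command, each chunk would become its own embed, for example:
#     for page in pages:
#         embed = discord.Embed(title="R&D Cost at every level")
#         for row in page:
#             embed.add_field(name=row, value=".", inline=False)
#         await ctx.send(embed=embed)
print([len(page) for page in pages])
```
+
+The same list of embeds could also be handed to the paginate coroutine above instead of being sent one by one.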
",python
+"Apply a function to each element of an array in PythonI am trying to do two things in Python:
+
+- Simulate 100 random draws from a Poisson distribution. I have done this by:
+
+sample100 = poisson.rvs(mu=5,size=100)
+
+
+- Take the above sample, and apply an UMP test I've generated to each individual observation (e.g., test the hypothesis against each individual observation). The test should accept the null hypothesis if the observation has a value < 8; reject with probability ~50% if observation has value = 8; reject if observation has value > 8
+
+I cannot figure out how to do the second part of this. The function code I've made is:
+def optionaltest(y,k,g):
+
+    if (y > k):
+        return 1
+    if (y == k):
+        if rand(uniform(0,1)) >= 0.4885: return 1
+        else: return 0
+    if (y < k):
+        return 0
+
+But there are two issues - apparently if (y==k) is invalid syntax. Second, even if I remove that part, I can't actually apply the function to sample100 since it is an array.
+How can I modify this to make it work? Clearly, I'm very new to Python but I have been scouring the internet for hours. Perhaps I should change how I'm generating my sample data so I can apply a function to it? Maybe there's a way to apply a function to each element of an array? How do I make the test logic work when the output = k (which I will set to 8 in this case)?
+EDIT/UPDATE:
+Here's how I ended up doing it:
+def optionaltest(y):
+
+    if (y > 8):
+        return 1
+    if (y == 8):
+        if np.random.uniform(0,1) >= 0.4885: return 1
+        else: return 0
+    if (y < 8):
+        return 0
+
+I was able to apply that test to my array data via:
+results_sample100 = list(map(optionaltest, sample100))
+cl.Counter(results_sample100)
+
","This is invalid python syntax
+if rand(uniform(0,1)) >= 0.4885 then 1
+ else 0
+
+Instead, you could do this:
+return 1 if rand(uniform(0,1)) >= 0.4885 else 0
+
+You could also do something more verbose but potentially more straightforward (this is often a matter of taste), like this:
+def optionaltest(y,k,g):
+
+    if (y > k):
+        return 1
+    if (y == k):
+        if rand(uniform(0,1)) >= 0.4885:
+            return 1
+        else:
+            return 0
+    if (y < k):
+        return 0
+
+Or even like this:
+def optionaltest(y,k,g):
+
+    if (y > k):
+        return 1
+    if (y == k) and rand(uniform(0,1)) >= 0.4885:
+        return 1
+    else:
+        return 0
+
+For this question:
+
+Maybe there's a way to apply a function to each element of an array?
+
+You can use a for-loop or map a function over a list:
+results = []
+for elem in somelist:
+    results.append(my_function(elem))
+
+Alternately:
+results = list(map(my_function, somelist))
+
+Your function takes three arguments, though, and it's not clear to me where those are coming from. Is your list a list of tuples?
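+
+If the extra arguments are really just fixed settings, one option is to give them defaults so the function can be mapped over the sample directly. The sketch below is only an illustration: the name optional_test, the default k=8, the seeded generator and the stand-in sample are assumptions, and 0.5115 is simply 1 - 0.4885 from the question.
+
```python
import random
from collections import Counter


def optional_test(y, k=8, p_reject_at_k=0.5115, rng=random):
    """Return 1 to reject the null hypothesis, 0 to accept it."""
    if y > k:
        return 1
    if y == k:
        # Randomize on the boundary: reject with probability p_reject_at_k.
        return 1 if rng.random() < p_reject_at_k else 0
    return 0


rng = random.Random(0)        # seeded so the run is repeatable
sample = [3, 8, 9, 5, 8, 12]  # stand-in for the Poisson draws
results = [optional_test(y, rng=rng) for y in sample]
counts = Counter(results)
print(results, counts)
```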
",python
+"Trying to filter one list according to whether elements are in another list in PythonI am trying to filter a list of genes I have obtained according to whether they are in a reference list. I have looked at these questions, which have been helpful, but they haven't helped me resolve the trouble I'm having (if/else in a list comprehension,
+List comprehension with else pass, if pass and if continue in python, Remove all the elements that occur in one list from another). Some of the answers from the last question in particular seemed very helpful but they didn't seem to work with my data.
+I've tried to simplify what I'm doing, and this is a little toy example I have now:
+head = genes_9.head()
+diff_expressed_tf = [gene for gene in genes_9 if gene in head]
+diff_expressed_tf
+
+# This returns
+[]
+
+I'm thinking that if I can get this to work with "genes_9.head()" it should work with my actual reference data.
+Would someone be able to help me rewrite this to do what I want it to do? Alternatively, if someone could point me towards other relevant questions, I would also appreciate that greatly.
+For reference, here is a little snippet of my data:
+genes_9.head(10)
+
+
+0 Tnfrsf4
+2 Tnfrsf18
+14 Il2ra
+5 Odc1
+7 Foxp3
+36 Ctla4
+3 Ikzf2
+1 Cd5
+8 Ccr8
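+24 Tnfrsf9
+
","If genes_9 is a pandas Series, note that gene in head checks the index labels of head, not its values, which is why the list comes back empty. Compare against head.values instead, and iterate with items() (iteritems() was removed in pandas 2.0), like this:
+diff_expressed_tf = [gene for index, gene in genes_9.items() if gene in head.values]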
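+
+A short sketch of the membership behaviour, using a few genes from the question and a made-up reference list (the names reference, label_hit and value_hit are illustrative). It also shows Series.isin, which avoids the explicit loop.
+
```python
import pandas as pd

# A few genes from the question, plus a hypothetical reference list.
genes_9 = pd.Series(["Tnfrsf4", "Tnfrsf18", "Il2ra", "Odc1", "Foxp3"],
                    index=[0, 2, 14, 5, 7])
reference = ["Foxp3", "Il2ra", "Gata3"]

# `in` on a Series looks at the index labels, not the stored values.
label_hit = "Foxp3" in genes_9
value_hit = "Foxp3" in genes_9.values

# Vectorized membership test on the values, then boolean indexing.
diff_expressed_tf = genes_9[genes_9.isin(reference)].tolist()
print(label_hit, value_hit, diff_expressed_tf)
```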
",python
+"Is there a method to aggregate time series in pandas based on the sequential count of an occurrence?I am looking for a way in pandas to count the number of sequential occurrences of a particular value in a time series.
+Suppose I am performing an experiment where I flip a coin and get heads or tails (1 or 0). I record my results in a pandas series, and I wish to see how many instances (a count) I had with two sequential heads, three sequential heads, four sequential heads, and so on. Moreover, I wish it to be something of a rolling count, meaning that a sequence of the form (tails, heads, heads, heads, tails) will return a count of two instances of heads occurring in pairs, and a single count of a series of three heads.
+Is there a natural way to do this with methods in a Series/DataFrame? I could do it with some for loops, but I am concerned about the cost of that.
+Thanks.
+Edit: requested input/output.
+Input:
+a = pd.DataFrame({'coin' : [0,1,1,1,0]})
+print(a.summary_of_windows())
+
+Output:
+{1: 3,
+ 2: 2,
+ 3: 1}
+
+The output could be a dictionary: the key 1 means heads occurrences, of which three occurred. Key 2 means pairs of sequential heads (there are two of those), and Key 3 means sequences of length 3 of heads (happened once).
","You can use DataFrame.rolling:
+>>> df
+ coin
+0 0
+1 1
+2 1
+3 1
+4 0
+
+# Compute how many sequences of two heads there are:
+>>> df['coin'].rolling(2).sum().eq(2).sum()
+2
+
+# Do the same for runs of three heads
+# (remember to change both the window size and the value passed to eq):
+>>> df['coin'].rolling(3).sum().eq(3).sum()
+1
+
+# Find the total number of heads occurrences:
+>>> df['coin'].sum()
+3
+
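The rolling-sum idea can be wrapped into a single helper that returns the dictionary asked for in the question. This is only a sketch: summary_of_windows is a hypothetical free function (not a pandas method), and it tries every window length up to the length of the series, which is fine for short experiments but quadratic in the worst case.

```python
import pandas as pd

def summary_of_windows(coin):
    # For each window length k, count the windows whose sum equals k,
    # i.e. the windows that consist of heads only.
    counts = {}
    for k in range(1, len(coin) + 1):
        hits = int(coin.rolling(k).sum().eq(k).sum())
        if hits:
            counts[k] = hits
    return counts

a = pd.DataFrame({'coin': [0, 1, 1, 1, 0]})
print(summary_of_windows(a['coin']))  # {1: 3, 2: 2, 3: 1}
```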
",python
+"How to get the duplicates from the lists based on nameI am accessing some files from the server and printing the results but how can I get the duplicates based on the name from the printed results.
+from datetime import datetime
+class EsriApiMaps:
+
+ def __init__(self, portal, item_type, query):
+ self.item_type = item_type
+ self.query_ = query
+ self.portal = portal
+
+ def query_maps(self):
+ api_query_result = self.portal.content.search(query=self.query_, item_type=self.item_type)
+ l = [] # we will store all the services e,g url,id,owner etc
+
+
+ for l in api_query_result:
+ l_created_time = datetime.fromtimestamp(round(l.created / 1000))
+ l_modified_time = datetime.fromtimestamp(round(l.modified / 1000))
+ df = ("Name: " + l.title + ", ID: " + l.id + ", Owner: " + l.owner + ", Created: " + str(l_created_time) + ", Modified: " + str(l_modified_time))
+ print(df)
+
+I have tried this to get the below results,
+Name: KL, ID: af57c454, Owner: Scripter, Created: 2019-10-08 12:57:45, Modified: 2019-10-08 12:57:45
+Name: KL, ID: dfsjd5s4, Owner: d011, Created: 2020-10-27 21:02:54, Modified: 2020-10-27 21:02:54
+Name: TEAM, ID: b8djx8, Owner: j277, Created: 2019-10-08 12:52:54, Modified: 2019-10-08 12:52:54
+Name: ALL, ID: b896sfd, Owner: rp10, Created: 2019-10-11 14:51:38, Modified: 2019-10-11 14:51:38
+Name: MD, ID: dhx865, Owner: ws07, Created: 2019-10-08 15:17:59, Modified: 2019-10-08 15:17:59
+Name: AJKL, ID: dhsa88, Owner: fsdd, Created: 2020-07-23 16:04:20, Modified: 2020-07-23 16:04:20
+Name: MD, ID: sd5425, Owner: fsdd, Created: 2021-02-02 11:43:15, Modified: 2021-02-02 11:43:15
+Name: MD, ID: vcxb65, Owner: dsff1, Created: 2020-06-17 10:56:36, Modified: 2020-06-17 10:56:36
+
+I have tried using,
+names = df.Name.value_counts()
+names[names>1]
+
+But I am getting this error AttributeError: 'str' object has no attribute 'Name'
+How can I get the duplicates based on its name ?
+The expected result is
+Name: KL, ID: af57c454, Owner: Scripter, Created: 2019-10-08 12:57:45, Modified: 2019-10-08 12:57:45
+Name: KL, ID: dfsjd5s4, Owner: d011, Created: 2020-10-27 21:02:54, Modified: 2020-10-27 21:02:54
+Name: MD, ID: sd5425, Owner: fsdd, Created: 2021-02-02 11:43:15, Modified: 2021-02-02 11:43:15
+Name: MD, ID: vcxb65, Owner: dsff1, Created: 2020-06-17 10:56:36, Modified: 2020-06-17 10:56:36
+Name: MD, ID: dhx865, Owner: ws07, Created: 2019-10-08 15:17:59, Modified: 2019-10-08 15:17:59
+
","df it's a string, not a dataframe, you should create a dataframe with the results of the API query and then you could use de dataframe methods.
+You can create a list with the results and then initialize the dataframe with it.
+def query_maps(self):
+ api_query_result = self.portal.content.search(query=self.query_, item_type=self.item_type)
+ data = []
+
+ for l in api_query_result:
+ l_created_time = datetime.fromtimestamp(round(l.created / 1000))
+ l_modified_time = datetime.fromtimestamp(round(l.modified / 1000))
+ data.append({"Name": l.title, "ID": l.id, "Owner": l.owner, "Created": str(l_created_time), "Modified": str(l_modified_time)})
+ df = pd.DataFrame(data)
+
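Once the results are in a dataframe, the duplicated names can be selected with DataFrame.duplicated. A minimal sketch, using a few rows copied from the printed results in the question in place of the real API response:

```python
import pandas as pd

# A few rows copied from the printed results in the question.
data = [
    {'Name': 'KL', 'ID': 'af57c454', 'Owner': 'Scripter'},
    {'Name': 'KL', 'ID': 'dfsjd5s4', 'Owner': 'd011'},
    {'Name': 'TEAM', 'ID': 'b8djx8', 'Owner': 'j277'},
    {'Name': 'MD', 'ID': 'dhx865', 'Owner': 'ws07'},
    {'Name': 'MD', 'ID': 'sd5425', 'Owner': 'fsdd'},
]
df = pd.DataFrame(data)

# keep=False marks every member of a duplicated group, not just the repeats.
duplicates = df[df.duplicated(subset='Name', keep=False)].sort_values('Name')
print(duplicates)
```

With keep=False the first occurrence of each name is kept in the result as well, which matches the expected output in the question.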
",python
+"Not getting number when crawling number of new COVID cases through BeautifulSoupGood evening,
+I am currently trying to crawl the South Australia's covid case number from the website (https://www.covid-19.sa.gov.au/home/dashboard).
+I found that the numbers are under
+<div id="convid19-data-visual" class="twbs">
+<div class="container">
+ <div class="row southaus">
+ <div clsass="col-md-6 col-lg-4" style="padding:10px 25px">
+ <div class="st">
+ "New Cases"
+ <span class="nCasesa majorNum">64</span>
+ </div>
+ </div>
+ </div>
+</div>
+
+
+Thus, I tried to crawl the number by applying the following code:
+import requests
+from bs4 import BeautifulSoup
+
+result = requests.get("https://www.covid-19.sa.gov.au/home/dashboard")
+soup = BeautifulSoup(result.text, "html.parser")
+cases = soup.find("div", {"class" : "st"})
+st = cases.find_all("span")
+print(st)
+
+and I got result of
+[<span class="nCasesa majorNum"> </span>]
+
+which does not include the case number.
+I had tried with selenium as well, but I was not able to get the case number either. I'm now confused whether the HTML tag that I found is right.
+If possible, would it be able to be fixed by getting right HTML tag?
+Thanks!
","The text associated with the span element of class nCasesa is loaded dynamically (JavaScript) and there is a delay in rendering the actual value in your browser. What you need to do (with Selenium) is to detect a change to the text. You can do it like this:
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from selenium.webdriver.support.ui import WebDriverWait
+
+CLASS = 'nCasesa'
+options = webdriver.ChromeOptions()
+options.add_argument('--headless')
+
+class detect():
+ def __init__(self, locator, params):
+ self.locator = locator
+ self.params = params
+ self.text = None
+
+ def gettext(self, driver):
+ return driver.find_element(self.locator, self.params).text
+
+ def __call__(self, driver):
+ if self.text is None:
+ self.text = self.gettext(driver)
+ else:
+ current = self.gettext(driver)
+ if current != self.text:
+ self.text = current
+ return True
+ return False
+
+with webdriver.Chrome(options=options) as driver:
+ driver.get(f'https://www.covid-19.sa.gov.au/home/dashboard')
+ detector = detect(By.CLASS_NAME, CLASS)
+ WebDriverWait(driver, 10).until(detector)
+ print(detector.text)
+
+Output:
+73
+
",python
+"Adding XML Source to xlsx file in pythonI am trying to create a xlsx from a template exported from Microsoft dynamics NAV, so I can upload my file to the system.
+I am able to recreate and fill the template using the library xlsxwriter, but unfortunately I have figured out that the template file also have an attached XML source code file(visible in the developer tab in Excel).
+I can easily modify the XML file to match what I want, but I can't seem to find a way to add the XML source code to the xlsx file.
+I have searched for "python adding xlsx xml source" but it doesn't seem to give me anything I can use.
+Any help would be greatly appreciated.
+Best regards
+Martin
","Xlsx file is basically a zip archive. Open it as archive and you'll probably be able to find the XML file and modify it. –
+Mak Sim
+yesterday
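A minimal sketch of that idea with the standard zipfile module. The member name xl/xmlMaps.xml is an assumption about where Excel keeps the XML map, so it is worth listing the archive of the real template first. The helper replace_member and the stand-in archive built below are hypothetical and only there to keep the example self-contained.

```python
import os
import tempfile
import zipfile

def replace_member(src_path, dst_path, member, new_bytes):
    # Copy every archive member to a new file, swapping in new_bytes for one of them.
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, 'w', zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            payload = new_bytes if info.filename == member else src.read(info.filename)
            dst.writestr(info, payload)

# Stand-in archive; a real workbook would be opened the same way.
workdir = tempfile.mkdtemp()
template = os.path.join(workdir, 'template.xlsx')
patched = os.path.join(workdir, 'patched.xlsx')
with zipfile.ZipFile(template, 'w') as zf:
    zf.writestr('xl/workbook.xml', '<workbook/>')
    zf.writestr('xl/xmlMaps.xml', '<MapInfo/>')  # assumed location of the XML map

replace_member(template, patched, 'xl/xmlMaps.xml', b'<MapInfo>edited</MapInfo>')

with zipfile.ZipFile(patched) as zf:
    names = zf.namelist()
    edited = zf.read('xl/xmlMaps.xml')
print(names, edited)
```

A zip archive cannot be edited in place with this module, which is why the sketch writes a second file instead of modifying the template.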
",python
+"How do I Anchor Text and Shrink it to fit it on an ImageI fount this code off of the PIL API(here is the link: https://pillow.readthedocs.io/en/stable/handbook/text-anchors.html) and I wanted to also shrink it depending on the size of the text while it is centered.
+here is the anchoring code
+from PIL import Image, ImageDraw, ImageFont
+
+font = ImageFont.truetype("mont.ttf", 48)
+im = Image.new("RGB", (200, 200), "white")
+d = ImageDraw.Draw(im)
+d.text((100, 100), "Quick", fill="black", anchor="ms", font=font)
+im.save('text.png')
+
+And the outcome looks like this:
+![]()
+But if you increase the word size it looks like this:
+![]()
+So I just want the text to be centered and shrunk to fit the image
","No detail about the requirements, so here only for result image with fixed size (200, 200), so font size will be changed.
+
+- Find the size of the text with ImageDraw.textbbox (ImageDraw.textsize was removed in Pillow 10)
+- Draw the text on a square image as wide as the text with ImageDraw.text
+- Resize that image to (200-2*border, 200-2*border) with Image.resize
+- Paste the resized image onto a 200x200 image with Image.paste
+
+from PIL import Image, ImageDraw, ImageFont
+
+def text_to_image(text, filename='text.png', border=20):
+ im = Image.new("RGB", (1, 1), "white")
+ font = ImageFont.truetype("calibri.ttf", 48)
+ draw = ImageDraw.Draw(im)
+ left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
+ width = max(right - left, bottom - top)
+ im = Image.new("RGB", (width, width), "white")
+ draw = ImageDraw.Draw(im)
+ draw.text((width//2, width//2), text, anchor='mm', fill="black", font=font)
+ im = im.resize((200-2*border, 200-2*border), resample=Image.LANCZOS)
+ new_im = Image.new("RGB", (200, 200), "white")
+ new_im.paste(im, (border, border))
+ new_im.show()
+ # new_im.save(filename)
+
+text_to_image("Hello World")
+
+![]()
",python
+"Measure image sharpness with opencv using gpuI created a small script that extract the most sharp image from a set of images using Laplacian like that:
+sharpness = cv2.Laplacian(cv2.imread(path), cv2.CV_64F).var()
+
+However the code is a bit slow and it seems to only use CPU, then I'm wondering if there's a method that uses the gpu to calculate that value, but only find example to sharpen an image.
","Don't optimize before you know what is taking time.
+Most time is spent on loading the image. Time it, you'll see. This involves accessing mass storage and decoding the image format. PNG isn't the most complex out there, so it could be worse.
+The laplacian calculation uses a specific kernel. Convolving the picture with an arbitrary 3x3 kernel would cost 9 multiplications and 9 additions. This kernel costs one shift and five adds/subs. The CPU's SIMD will eat this for breakfast.
+A GPU won't help at all. It takes time to transfer this data to the GPU. Then there are other constant costs (latency, "warm-up") to starting any calculations on a GPU. A CPU would already be done calculating. If you had a ton of pictures, at least the transfer could be pipelined and the upload of kernel code would only be required once.
+Both the GPU and the CPU are likely memory-bound in this entire operation, meaning compute capability is far from challenged by this.
+If you really wanted to get a GPU involved, the easiest way would be to wrap the numpy array in a cv.UMat and pass the UMat object in instead. OpenCV will then use OpenCL. The result will be a UMat again, so you would need to see what OpenCV function can calculate the variance for you.
+h_im = cv.imread(...) # hostside data
+d_im = cv.UMat(h_im) # usable on "device"
+d_lap = cv.Laplacian(d_im, cv.CV_32F) # single floats are usually faster than doubles
+h_lap = d_lap.get() # retrieve data
+# numpy functions unavailable on UMat, hence hostside calculation
+var = h_lap.var()
+# try cv.meanStdDev, calculates for each channel
+
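To make the cost argument concrete, here is a NumPy-only sketch of the same 4-neighbour Laplacian, written as a handful of shifted additions. It is an illustration, not OpenCV's implementation: it skips the one-pixel border instead of reflecting it, so the numbers will not match cv2.Laplacian exactly, and the test images are synthetic.

```python
import numpy as np

def laplacian_variance(gray):
    # Kernel [[0, 1, 0], [1, -4, 1], [0, 1, 0]] expressed as four shifted adds
    # and one scaled subtraction over the interior pixels.
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128.0)                         # no detail at all
noisy = flat + rng.normal(0.0, 10.0, size=flat.shape)   # lots of fine detail
print(laplacian_variance(flat), laplacian_variance(noisy))
```

The arithmetic per pixel is tiny, which is consistent with the point above that decoding the file, not the Laplacian, is the likely bottleneck.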
",python
+"Modifying the Grid in matplotlib.pyplot graphI am a newbie to Python but slowly getting there. I am having a problem trying to increase the number of grid lines on a graph. Basically, the Graph is labelled 0-24 (Hours) but the x axis only generates a label every 5 hours (O,5,10,15,20) with a grid line at each of those majors. Ideally, I would like a grid line every hour as I am collecting real time data.
+Most of this code has been lifted from various sources, but the one thing that has stumped me is how to configure the grid..
+Edit - As requested my simplified code is below..
+import numpy as np
+import matplotlib.pyplot as plt
+import time
+
+timedata=[0.01,1.1,2.2,3.3,4.4,5.55,6.6,7.7,8.8,9.1,10.2,11.2,12.2,13.2,14.1,15.2,16.1,17.2,18.1,19.2,20.1,21.1,22.2,23.1]
+#timedata is in decimal hours
+bxdata=[10,10,20,20,20,30,30,30,40,40,40,30,30,30,20,20,30,30,20,20,40,50,30,24]
+bydata=[20,10,20,30,20,30,30,30,5,40,40,30,5,30,20,20,30,35,20,20,5,50,30,24]
+
+#draw the graph
+fig, ax = plt.subplots(sharex=True, figsize=(12, 6))
+x=np.arange(0,24,1)
+
+ax.plot(timedata,bxdata, color='red', label='Bx',lw=1)
+ax.plot (timedata, bydata, color='blue', label = 'By',lw=1)
+ax.set_xlim(0,24)
+ax.set_ylim(-250,250)
+
+plt.ion()
+plt.xlabel("Time (Hours)")
+plt.ylabel("nT")
+plt.grid(True, which='both')
+plt.legend()
+plt.show()
+image = "test.png"
+time.sleep(2)
+plt.savefig(image)
+plt.close('all')
+
+and this is the graph that I get.
+![]()
","The idea is to associate a locator to the minor x-axis ticks, the locator you need is MultipleLocator and we use it also to fix the major ticks' spacing (for hours, 6 is better than 5, isn't it?)
+![]()
+import numpy as np
+import matplotlib.pyplot as plt
+from matplotlib.ticker import MultipleLocator
+
+y = np.random.rand(25)
+plt.plot(y)
+
+plt.gca().xaxis.set_major_locator(MultipleLocator(6))
+plt.gca().xaxis.set_minor_locator(MultipleLocator(1))
+
+plt.grid()
+plt.grid(True, 'minor', color='#ddddee') # use a lighter color
+
+plt.show()
+
",python
+"Speeding up python when using nested for and if loopsI have a csv file that has a column called "Authors". In that column, each row has a couple of authors separated by commas. In the code below the function, getAuthorNames gets all the author names in that column and returns an array with all their names.
+Then the function authCount counts how many times an individual name appears in the Author column. At first, I was doing it with a couple of hundred rows and had no issues. Now I am trying to do it with 20,000 rows+ and it has taken a couple of hours and still no results. I believe it is the nested for loops and if statement that is causing it to take so long. Any advice on how to speed up the process would help. Should I be using lambda? Is there a built it pandas function that could help?
+This is what the input data looks like:
+Title,Authors,ID
+XXX,"Wang J, Wang H",XXX
+XXX,"Wang J,Han H",XXX
+
+And this is what the output would look like
+Author,Count
+Wang J,2
+Wang H,1
+Han H,1
+
+Here is the code:
+ import pandas as pd
+
+
+ df = pd.read_csv (r'C:\Users\amos.epelman\Desktop\Pubmedpull3GC.csv')
+
+
+ def getAuthorNames(dataFrame):
+ arrayOfAuthors = []
+ numRows = dataFrame.shape[0]
+
+ cleanDF = dataFrame.fillna("0")
+
+ for i in range (0,numRows):
+ miniArray = cleanDF.at[i,"Authors"].split(",")
+ arrayOfAuthors += miniArray
+
+ return arrayOfAuthors
+
+
+ def authCount(dataFrame):
+ authArray = getAuthorNames(dataFrame)
+ numAuthors = len(authArray)
+ countOfAuth = [0] * numAuthors
+
+ newDF = pd.DataFrame({"Author Name": authArray, "Count": countOfAuth})
+ refDF = dataFrame.fillna("0")
+
+
+ numRows= refDF.shape[0]
+
+
+ for i in range (0,numAuthors):
+ for j in range (0,numRows):
+ if newDF.at[i, "Author Name"] in refDF.at[j,"Authors"]:
+ newDF.at[i,"Count"] += 1
+
+ sortedDF = newDF.sort_values(["Count"], ascending = False)
+
+ noDupsDF = sortedDF.drop_duplicates(subset ="Author Name", keep = False)
+
+ return noDupsDF
+
+
+
+
+ finalDF = authCount(df)
+ file_name = 'GC Pubmed Pull3 Author Names with Count.xlsx'
+ finalDF.to_excel(file_name)
+
","you could try using Counter and a lambda function to eliminate your nested for loop over two dataframes, which seems like it would be a slow way to add a new column
+from collections import Counter
+
+Then to get the "Counts" column
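+# Count individual names, not the whole comma-separated "Authors" strings.
+author_counts = Counter(name.strip() for authors in refDF["Authors"] for name in authors.split(","))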
+
+newDF["Count"] = newDF.apply(lambda r: author_counts[r["Author Name"]], axis=1)
+
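The question also asks whether a built-in pandas function could do this. One possible sketch, using the two sample rows from the question, splits the column, explodes it to one author per row, and counts the values. The small dataframe below stands in for the real CSV.

```python
import pandas as pd

# Sample rows from the question, standing in for the real CSV.
df = pd.DataFrame({
    'Title': ['XXX', 'XXX'],
    'Authors': ['Wang J, Wang H', 'Wang J,Han H'],
})

author_counts = (
    df['Authors']
    .fillna('')                  # rows without authors become empty strings
    .str.split(',')              # one list of names per row
    .explode()                   # one name per row
    .str.strip()                 # drop the space after each comma
    .loc[lambda s: s != '']      # discard the empty placeholders
    .value_counts()
)
print(author_counts)
```

This runs in a single pass over the data instead of comparing every author against every row, so 20,000 rows should take well under a second.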
",python
+"JSON serialization of dictionary with complex objectsI am trying to serialize the dictionary playersElo for saving/loading it as/from JSON.
+But as it's not a serializable object and I can't find a way to do it.
+playersElo={} # dictionary of {<int> : <PlayerElo>}
+playersElo[1] = PlayerElo()
+playersElo[2] = PlayerElo()
+...
+
+class PlayerElo:
+ """
+ A class to represent a player in the Elo Rating System
+ """
+ def __init__(self, name: str, id: str, rating):
+ self.id = id
+ self.name = name
+ # comment the 2 lines below in order to start with a rating associated
+ # to current player rank
+ self.eloratings = {0: 1500}
+ self.elomatches = {0: 0}
+ self.initialrating = rating
+
+
","Maybe this can be a starting spot for you. The serializer grabs the __dict__ attribute from the object and makes a new dict-of-dicts, then writes it to JSON. The deserializer creates a dummy object, then updates the __dict__ on the way in.
+import json
+
+class PlayerElo:
+ """
+ A class to represent a player in the Elo Rating System
+ """
+ def __init__(self, name: str, id: str, rating):
+ self.id = id
+ self.name = name
+ self.eloratings = {0: 1500}
+ self.elomatches = {0: 0}
+ self.initialrating = rating
+
+
+playersElo={} # dictionary of {<int> : <PlayerElo>}
+playersElo[1] = PlayerElo('Joe','123',999)
+playersElo[2] = PlayerElo('Bill','456',1999)
+
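+def serialize(ratings):
+    # __dict__ holds each player's attributes; JSON turns the integer keys into strings.
+    newdict = {i: j.__dict__ for i, j in ratings.items()}
+    with open('x.json', 'w') as f:
+        json.dump(newdict, f)
+
+def deserialize():
+    with open('x.json') as f:
+        o = json.load(f)
+    pe = {}
+    for k, v in o.items():
+        obj = PlayerElo('0', '0', 0)  # dummy object, overwritten below
+        obj.__dict__.update(v)
+        pe[int(k)] = obj
+    return pe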
+
+print(playersElo)
+serialize( playersElo )
+pe = deserialize( )
+print(pe)
+
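One detail worth knowing about this round trip: JSON object keys are always strings, so the integer keys inside eloratings and elomatches come back as '0' rather than 0 unless they are converted. A possible variant, sketched with hypothetical helper names (dumps_players, loads_players, int_keys), uses the default hook of json.dumps and restores the integer keys on the way in:

```python
import json

class PlayerElo:
    def __init__(self, name, id, rating):
        self.id = id
        self.name = name
        self.eloratings = {0: 1500}
        self.elomatches = {0: 0}
        self.initialrating = rating

def dumps_players(players):
    # default= is called for any object json cannot encode; here that is PlayerElo.
    return json.dumps(players, default=lambda obj: obj.__dict__)

def int_keys(mapping):
    # JSON object keys are always strings, so turn them back into integers.
    return {int(key): value for key, value in mapping.items()}

def loads_players(text):
    players = {}
    for key, fields in json.loads(text).items():
        player = PlayerElo(fields['name'], fields['id'], fields['initialrating'])
        player.eloratings = int_keys(fields['eloratings'])
        player.elomatches = int_keys(fields['elomatches'])
        players[int(key)] = player
    return players

original = {1: PlayerElo('Joe', '123', 999), 2: PlayerElo('Bill', '456', 1999)}
restored = loads_players(dumps_players(original))
print(restored[1].name, restored[1].eloratings)  # Joe {0: 1500}
```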
",python
+"await vs yield from for pythonhttps://stackoverflow.com/a/44273861/433570
+says 'yield from' is old and we should learn 'await'.
+But It doesn't say they are the same thing or they are different.
+But I have some good book and videos which talks about yield from
+Can I think yield from was replaced by await? and they are essentially the same thing?
+https://www.youtube.com/watch?v=MCs5OvhV9S4
+I have a book fluent python which also talks about yield from
+
+When I see some good books/videos like the above talking about yield from, could I substitute yield from with await in my mind?
","Yes they are the same thing. Yield is the manual way of doing async await. See https://www.python.org/dev/peps/pep-0492/#new-coroutine-declaration-syntax where they detail that async await is just a coroutine (yield) underneath
",python
+"How to convert list of jsons in a nested list of jsons to csv in python?How do I convert a list of jsons in a nested list of jsons to be in the following format? Can't seem to get this right and many examples use pandas where as I'd prefer to use csv.DictWriter. My thoughts are to (in a loop) read the json - in this case data, transpose it for it to be horizontal.
+{"rows": [
+ {
+ "data": [
+ {
+ "A": "1",
+ "B": "2"
+ },
+ {
+ "C": "3",
+ "D": "4"
+ },
+ {
+ "E": "5",
+ "F": "6"
+ }
+ ]
+ },
+ ...
+ ...
+ ...
+ {
+ "data": [
+ {
+ "A": "7",
+ "B": "8"
+ },
+ {
+ "C": "9",
+ "D": "10"
+ },
+ {
+ "E": "11",
+ "F": "12"
+ }
+ ]
+ }
+]
+}
+
+Desired format:
+A, B, C, D, E, F
+1, 2, 3, 4, 5, 6
+...
+7, 8, 9, 10, 11, 12
+
+
+I've read the json already using json.loads. Just stuck on converting this bit.
","DictWriter is pretty straightforward with a dict comprehension to generate the row data:
+import json
+import csv
+
+json_str = '''{"rows": [{"data": [{"A": "1", "B": "2"},
+ {"C": "3", "D": "4"},
+ {"E": "5", "F": "6"}]},
+ {"data": [{"A": "7", "B": "8"},
+ {"C": "9", "D": "10"},
+ {"E": "11", "F": "12"}]}]}'''
+
+data = json.loads(json_str)
+with open('out.csv', 'w', newline='') as f:
+    w = csv.DictWriter(f, fieldnames='ABCDEF')
+    w.writeheader()
+    for row in data['rows']:
+        # Merge the small dicts of one row into a single flat dict.
+        row_data = {k: v for d in row['data'] for k, v in d.items()}
+        print(row_data)
+        w.writerow(row_data)
+
+Output (dict comprehension result):
+{'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5', 'F': '6'}
+{'A': '7', 'B': '8', 'C': '9', 'D': '10', 'E': '11', 'F': '12'}
+
+out.csv:
+A,B,C,D,E,F
+1,2,3,4,5,6
+7,8,9,10,11,12
+
",python
+"I keep getting ""Ran out of input"" error in for() function to open and adjust pickle fileI was making stock data MACD calculator in python. The way of my approach is using 'for()' to access pickle datas in certain directory and calculate MACD values one by one. However, I got 'Ran out of input error' everytime. I checked my directory where pickle datas are stored and it was not empty. Funny thing is, If I just put numbers in i position without using 'for()', I could get data of the pickle file. Please help me to get free from this error.
+Here's my code:
+'''
+import pickle
+import os
+import pathlib
+from pathlib import Path
+
+file_list = os.listdir('/home/sejahui/projects/pickle_data')
+os.chdir('/home/sejahui/projects/pickle_data')
+
+for i in range(2):
+ odd = file_list[i]
+ with open(odd,'rb') as stock:
+ data = pickle.load(stock)
+ print(data)
+
+'''
","In this case, actually, you create a list with file_list = os.listdir('/home/sejahui/projects/pickle_data'). You do not need range to iterate.
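+Hard-coding range(2) also fails with an IndexError when the directory holds fewer than two files. The 'Ran out of input' message itself is an EOFError, which pickle.load raises when a file is empty or truncated, so check the file sizes too. Iterating over the list directly looks like this: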
+import pickle
+import os
+import pathlib
+from pathlib import Path
+
+file_list = os.listdir('/home/sejahui/projects/pickle_data')
+os.chdir('/home/sejahui/projects/pickle_data')
+
+for name in file_list:
+    with open(name, 'rb') as stock:
+        data = pickle.load(stock)
+        print(data)
+
+You can also add an if statement to filter the file names, for example by a pattern:
+if name == "something":
+    with open(name, 'rb') as stock:
+        data = pickle.load(stock)
+
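Since pickle.load raises EOFError with the message 'Ran out of input' when it is handed an empty file, a defensive loader can skip such files instead of stopping. A minimal sketch, with a hypothetical helper load_pickles and a temporary directory standing in for the real data folder:

```python
import os
import pickle
import tempfile

def load_pickles(directory):
    loaded = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip sub-directories and zero-byte files, which cannot hold a pickle.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            continue
        try:
            with open(path, 'rb') as fh:
                loaded[name] = pickle.load(fh)
        except (EOFError, pickle.UnpicklingError):
            # Truncated or non-pickle file: report it and move on.
            print('could not load', name)
    return loaded

# Demo in a temporary directory: one valid pickle and one empty file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'good.pkl'), 'wb') as fh:
    pickle.dump({'macd': [1, 2, 3]}, fh)
open(os.path.join(demo_dir, 'empty.pkl'), 'wb').close()

result = load_pickles(demo_dir)
print(result)  # {'good.pkl': {'macd': [1, 2, 3]}}
```

Building the full path with os.path.join also removes the need for os.chdir.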
",python
+"Understanding .fork() with multiprocessing and loggingI am running a multiprocessing pool which logs to a logger that is configured by the main parent process. In one scenario, I have
+def init():
+ global LOG
+ LOG = logging.getLogger(__name__)
+ LOG.setLevel(logging.DEBUG)
+
+def main():
+ LOG.info("test")
+
+if __name__ == '__main__':
+ fmt = "%(asctime)s %(name)s %(levelname)s: %(message)s"
+ logging.basicConfig(format=fmt, datefmt='%m/%d/%Y %I:%M:%S %p')
+ with multiprocessing.Pool(initializer = init) as pool:
+ pool.map(main, inputs)
+
+Nothing gets printed to the stdout of my main calling thread. However, if I do this:
+fmt = "%(asctime)s %(name)s %(levelname)s: %(message)s"
+logging.basicConfig(format=fmt, datefmt='%m/%d/%Y %I:%M:%S %p')
+
+def init():
+ global LOG
+ LOG = logging.getLogger(__name__)
+ LOG.setLevel(logging.DEBUG)
+
+def main():
+ LOG.info("test")
+
+if __name__ == '__main__':
+ with multiprocessing.Pool(initializer = init) as pool:
+ pool.map(main, inputs)
+
+Then I do have proper logging. I don't see why these two things are different. When python forks into worker processes, the child process should be identical to the parent process so there shouldn't be a distinction between whether we called logging.basicConfig in the global namespace vs the main execution block. Can someone clarify?
","Your code is perfectly fine for Linux-es (except the fact that it won't even run, haha)
+However, on Windows and macOS (since Python 3.8), multiprocessing spawns a fresh interpreter (see the docs) and imports the target module in it. This is the fork (pun intended) point where your __name__ == '__main__' guard makes a difference: the logging setup inside the guard is never run in the spawned workers!
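One way to make the behaviour independent of the start method is to configure logging inside the pool initializer, which runs once in every worker. The sketch below assumes that approach is acceptable. The names init_worker and work are hypothetical, and the worker returns a value only so that the result of pool.map is visible.

```python
import logging
import multiprocessing

FMT = '%(asctime)s %(processName)s %(levelname)s: %(message)s'

def init_worker():
    # Runs once in every worker, so the handler exists under fork and spawn alike.
    logging.basicConfig(format=FMT, level=logging.DEBUG)

def work(x):
    logging.getLogger(__name__).info('processing %s', x)
    return x * x

if __name__ == '__main__':
    init_worker()  # configure the parent process as well
    with multiprocessing.Pool(processes=2, initializer=init_worker) as pool:
        print(pool.map(work, [1, 2, 3]))
```

basicConfig does nothing when the root logger already has a handler, so calling it in a forked worker that inherited the parent's setup is harmless.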
",python
+"Is there a way to know the width of a character in turtle graphics?I'm currently making a typing game in turtle graphics. As the user types, an arrow above the sentence they are typing moves. This is used to show the user where they are in the sentence. However, I'm having trouble making the arrow stay above the letter that the user is actually on. It seems that the width of each letter isn't a constant measurement. Therefore, this will not work:t.forward(any number here) Is there any way to know the width of every single letter in a sentence? Or is there a certain font type that has a constant width no matter the character? I apologize if this does not make sense. Thanks
","There are a couple of ways you can look at this:
+To know width of character you just wrote
+You can do this by always lifting the pen prior to writing a character, and asking the turtle to move with the character. If you record the turtle position before and after writing, then you can work out and return the width:
+ import turtle
+ def write_character(t: turtle, char: str, font: str = "Arial") -> float:
+ """Write character and return width"""
+ pen_was_down = t.isdown()
+ if pen_was_down:
+ t.penup()
+ x_start, _ = t.position()
+ t.write(char, move=True, font=(font, 50, "normal"))
+ x_end, _ = t.position()
+ if pen_was_down:
+ t.pendown()
+ return x_end - x_start
+
+This way you can then move your arrow forward by the width returned.
+To use a fixed width character
+As you've found out, some fonts do not have a fixed width. Some do, however; these are called monospaced. One commonly used example of this is Courier. The following snippet uses the previous method to examine the distribution of character widths for both Arial and Courier fonts:
+import string
+from collections import Counter
+
+print("Arial", Counter(
+ [write_character(turtle.Turtle(), c, "Arial") for c in string.ascii_lowercase]
+))
+print("Courier", Counter(
+ [write_character(turtle.Turtle(), c, "Courier") for c in string.ascii_lowercase]
+))
+
+which outputs:
+Arial Counter({27.0: 11, 24.0: 7, 11.0: 3, 13.0: 2, 41.0: 1, 16.0: 1, 36.0: 1})
+Courier Counter({30.0: 26})
+
+Here we can see that Arial has 7 different character widths, whilst Courier has only a single character width.
",python
+"Traversing a BST in an in-order fashion and storing the nodes in an array, but the output is not sorted?I am traversing a Binary Tree in an in-order manner to determine if it is a Binary Search Tree. Here is my code:
+class T:
+ def __init__(self, value, left=None, right=None):
+ self.value = value
+ self.left = left
+ self.right = right
+
+def is_bst(node):
+ if not node:
+ return
+
+ output = []
+
+ is_bst(node.left)
+ output.append(node.value)
+ is_bst(node.right)
+ print(output)
+
+![]()
+For the above example, the output = [1,3,2,9,7,5]
+But this is clearly not correct! - I would usually debug but I am not familiar with running trees/binary trees as inputs. Any idea where my code is going wrong???
+
+Updated code:
+class T:
+ def __init__(self, value, left=None, right=None):
+ self.value = value
+ self.left = left
+ self.right = right
+
+
+def inOrderTraversal(node,output):
+ if not node:
+ return None
+
+ is_bst(node.left)
+ output.append(node.value)
+ is_bst(node.right)
+ return
+
+def is_bst(node):
+
+ output = []
+
+ inOrderTraversal(node,output)
+
+ print(output)
+
+For the same example, the output = [1,3,2,9,7,5] is still wrong
","You create your output inside the routine, so it's always empty. Then you add the current node's value but you print it at the end of the routine. The result is postorder, not inorder - each node is printed after both its subtrees.
+Apart from the code structure, your function has the wrong name: you don't actually want it to answer whether the tree is a BST, you just want it to return the contents:
+def dump_tree_inorder(node, output):
+ if not node:
+ return
+
+ dump_tree_inorder(node.left, output)
+ output.append(node.value)
+ dump_tree_inorder(node.right, output)
+
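With the in-order dump in place, the original BST check can be built on top of it: a tree is a BST exactly when its in-order sequence is sorted. A minimal sketch follows. The strict comparison assumes duplicate values are not allowed, and the two example trees are made up for illustration.

```python
class T:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def dump_tree_inorder(node, output):
    # Left subtree, then the node itself, then the right subtree.
    if not node:
        return
    dump_tree_inorder(node.left, output)
    output.append(node.value)
    dump_tree_inorder(node.right, output)

def is_bst(root):
    values = []
    dump_tree_inorder(root, values)
    # Every value must be strictly smaller than its in-order successor.
    return all(a < b for a, b in zip(values, values[1:]))

good = T(5, T(3, T(1), T(4)), T(9, T(7)))
bad = T(5, T(3, T(1), T(2)), T(9, T(7)))
print(is_bst(good), is_bst(bad))  # True False
```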
",python
+"How to convert a NetworkX graph with complex weights to a matrix?I have a graph whose weights are complex numbers. networkx has a few functions for converting the graph to a matrix of edge weights, however, it doesn't seem to work for complex numbers (though the reverse conversion works fine). It seems to require either int or float edge weights in order to convert them into a NumPy array/matrix.
+Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
+Type 'copyright', 'credits' or 'license' for more information
+IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.
+
+In [1]: import numpy as np
+
+In [2]: import networkx as nx
+
+In [3]: X = np.random.normal(size=(5,5)) + 1j*np.random.normal(size=(5,5))
+
+In [4]: X
+Out[4]:
+array([[ 1.64351378-0.83369888j, -2.29785353-0.86089473j,
+...
+...
+ 0.50504368-0.67854997j, -0.29049118-0.48822688j,
+ 0.22752377-1.38491981j]])
+
+In [5]: g = nx.DiGraph(X)
+
+In [6]: for i,j in g.edges(): print(f"{(i,j)}: {g[i][j]['weight']}")
+(0, 0): (1.6435137789271903-0.833698877745345j)
+...
+(4, 4): (0.2275237661137745-1.3849198099771993j)
+
+# So conversion from matrix to nx.DiGraph works just fine.
+# But the other way around gives an error.
+
+In [7]: Z = nx.to_numpy_array(g, dtype=np.complex128)
+---------------------------------------------------------------------------
+TypeError Traceback (most recent call last)
+<ipython-input-7-b0b717e5ec8a> in <module>
+----> 1 Z = nx.to_numpy_array(g, dtype=np.complex128)
+
+~/miniconda3/envs/coupling/lib/python3.9/site-packages/networkx/convert_matrix.py in to_numpy_array(G, nodelist, dtype, order, multigraph_weight, weight, nonedge)
+ 1242 for v, d in nbrdict.items():
+ 1243 try:
+-> 1244 A[index[u], index[v]] = d.get(weight, 1)
+ 1245 except KeyError:
+ 1246 # This occurs when there are fewer desired nodes than
+
+TypeError: can't convert complex to float
+
+I have looked at the documentation and all it seems to say is that this works only for a simple NumPy datatype and for compound types, one should use recarrays. I don't understand recarrays well and using np.to_numpy_recarray also yields an error.
+In [8]: Z = nx.to_numpy_recarray(g, dtype=np.complex128)
+...
+TypeError: 'NoneType' object is not iterable
+
+So the question is how to convert the graph into a matrix of edge weights correctly?
","Below is a quick hack that could be useful until a fix is implemented:
+import networkx as nx
+import numpy as np
+
+
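+def to_numpy_complex(G):
+    # Start from zeros so that missing edges are 0 rather than uninitialized memory.
+    nodes = list(G.nodes())
+    index = {node: k for k, node in enumerate(nodes)}  # works for any node labels
+    E = np.zeros(shape=(len(nodes), len(nodes)), dtype=np.complex128)
+
+    for i, j, attr in G.edges(data=True):
+        E[index[i], index[j]] = attr.get("weight", 1)
+
+    return E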
+
+
+X = np.random.normal(size=(5, 5)) + 1j * np.random.normal(size=(5, 5))
+
+g = nx.DiGraph(X)
+
+Y = to_numpy_complex(g)
+
+print(np.allclose(X, Y)) # True
+
",python
+"Python Virtualenv not creating the new environment in the directory I amFirst of all, to put in context:
+
+I have installed Python 3.9 which comes from Visual Studio 2019
+
+I have installed Python 3.8 from Microsoft Store which installs it in the path:
+C:\Users\username\AppData\Local\Microsoft\WindowsApps
+
+
+Now I want to create a virtual environment for Python 3.8 so I can switch to it whenever I need. So I follow below steps (below commands are all executed from path C:\Users\username\AppData\Local):
+
+Installing virtualenv:
+py -3.8 -m pip install virtualenv
+
+
+![]()
+
+Creating new virtual environment for Python 3.8:
+py -3.8 -m virtualenv _venv38.win32
+
+
+![]()
+And what's the surprise? Folder _venv38.win32 is not created within the directory I am which is C:\Users\username\AppData\Local
+Instead _venv38.win32 is created in:
+C:\Users\username\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\Local\_venv38.win32
+
+So why? I want it to be created in:
+C:\Users\username\AppData\Local\_venv38.win32
+
+which is the path from where I have executed the command (step 2)
","use
+py -3.8 -m venv _venv38.win32
+
+this will create venv at cwd.
+virtualenv has a custom "remote" location for virtual environments somewhere outside your project
",python
+"Adding a list of strings to an existing key in a dictI am trying to add strings to a list(adding new strings in each loop with .append()), after that adding the list to an existing key. The problem is after adding the new string to the list and than to a key in console get printed [...](how to get rid of this [...]) example:
+x = {}
+y = ["going home"]
+x["key"].append(y)
+y.append("after lunch")
+x["key"].append(y)
+print(x)
+{'key' : ['going home', 'after lunch', [...]]}
+
+Thank you for your time
","Maybe the correct behavior is:
+x = {}
+y = ["going home"]
+x["key"] = y
+y.append("after lunch")
+print(x)
+
+{'key': ['going home', 'after lunch']}
+
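The [...] in the printed output is how Python displays a list that contains itself: once x['key'] and y are the same list object, x['key'].append(y) puts the list inside itself. A short sketch of the difference between append and extend, with made-up variable names:

```python
# extend copies the items over, so the stored list never refers to itself.
notes = {}
batch = ['going home']
notes.setdefault('key', []).extend(batch)
batch = ['after lunch']
notes['key'].extend(batch)
print(notes)  # {'key': ['going home', 'after lunch']}

# append adds the list object itself; appending a list to itself creates the loop.
loop = ['going home']
loop.append(loop)
print(loop)   # ['going home', [...]]
```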
",python
+"Error in getting values from python dictionaryI've created a python dictionary
+user = input('enter a letter: ')
+d = {'a': '1', 'b': '2', 'c':'3'}
+print(d.get(d))
+
+I ran that and entered
+a
+
+But I always get an error like this:
+enter a letter: a
+Traceback (most recent call last):
+ File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/iiec_run.py", line 31, in <module>
+ start(fakepyfile,mainpyfile)
+ File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/iiec_run.py", line 30, in start
+ exec(open(mainpyfile).read(), __main__.__dict__)
+ File "<string>", line 3, in <module>
+TypeError: unhashable type: 'dict'
+
+[Program finished]
+
+Please help me solve this!
","d.get(d) passes the dictionary itself as the key, and a dict is unhashable. Pass the variable that holds the user's input instead:
+user = input('enter a letter: ')
+d = {'a': '1', 'b': '2', 'c':'3'}
+print(d.get(user))
+
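+If the user may type a letter that is not in the dictionary, dict.get also accepts a default value. A minimal sketch (the lookup helper and the 'not found' default are hypothetical, added only for illustration):
+
```python
d = {'a': '1', 'b': '2', 'c': '3'}


def lookup(letter, default='not found'):
    # dict.get returns the default instead of raising KeyError for unknown keys
    return d.get(letter, default)


print(lookup('a'))
print(lookup('z'))
```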
",python
+"Update Global Variables Input as Parameters Rather Than Returning Results From Function
+Goal
+I am trying to write a function where one or more of the input parameters is a global variable that is updated by the function, without having to return values from within the function. I am aware I could just return a tuple or two separate values from the function, but I think updating the global variables from within the function would be another interesting method if it is possible.
+Reason to do this
+Updating global variables with a function is easy when the global variable is known (ie. defined previously in the python script). However, I want to define the function in a separate .py file to easily use the function within other python scripts. Therefore, I need to be able to support different variable names to update.
+While this is not at all necessary, I am just interested if this is even possible.
+Example Pseudocode
+I'm thinking something like this:
+def math_function(input_val, squared_result, cubed_result):
+    squared_result = input_val**2 #update the var input as the squared_result parameter
+    cubed_result = input_val**3 #update the var input as the cubed_result parameter
+
+where you would input a number for input_val and then global variables for squared_result and cubed_result that the function updates with the result. It would then theoretically work like:
+#Declare global variables
+b = 0
+c = 0
+
+#then somewhere in the code, call the function
+math_function(2, b, c)
+
+#check the new values
+print(b) #Output: b = 4
+print(c) #Output: c = 8
+
+This would allow me to use the function in different python scripts without having to worry about what order the results are returned in.
","First: I am in no way advocating this.
+You could use the globals builtin function to access a global variable by name:
+def gtest(name,value):
+    globals()[name] = value
+
+gtest('new_global','new_value')
+print(new_global)
+
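+One caveat: globals() returns the namespace of the module in which the function is defined, so if gtest lives in a separate .py file it updates that file's globals rather than those of the calling script. A minimal sketch of an alternative that works across modules, assuming it is acceptable to pass in a mutable container (the results dict and its key names are hypothetical):
+
```python
def math_function(input_val, results):
    # Mutate the caller's dict in place instead of rebinding local names
    results["squared"] = input_val ** 2
    results["cubed"] = input_val ** 3


results = {}
math_function(2, results)
print(results["squared"], results["cubed"])
```
+
+The caller chooses the container and its name, so the function does not need to know which variables exist in the calling script.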
",python
+"How to prefill django form Dynamically
+I am trying to display a form in Django and pre-fill it dynamically.
+I want the user of my sample news gathering site to modify an entry.
+I have my Manual Input form class
+#forms.py
+class ManualInputForm(forms.Form):
+ source = forms.CharField(label="Source:", widget = forms.TextInput(attrs={'size': 97}))
+ topic = forms.CharField(label="Topic:", widget = forms.TextInput(attrs={'size': 97}))
+ news = forms.CharField(widget = forms.Textarea(attrs={"rows":5, "cols":100}))
+ link = forms.CharField(label="Link (optional):", required = False, widget = forms.TextInput(attrs={'size': 97}))
+
+In the HTML I am going manually because I would like to pre-fill all fields with data coming in from the related function in views.py.
+#html file
+<form method="post" class="form-group">
+ {% csrf_token %}
+ <div class="input-group mb-3">
+ <div class="container">
+ {% for field in form %}
+ <div class="fieldWrapper">
+ {{ field.errors }}
+ {{ field.label_tag }}
+ <br>
+ {{ field }}
+ </div>
+ {% endfor %}
+ </div>
+
+ <div class="input-group">
+ <p> </p>
+ <button type="submit" class="btn btn-success" name="Submit">Save</button>
+ </div>
+ </div>
+</form>
+
+How do I do it? It's driving me crazy o.O
+I would like to keep using django's forms because of its integrated error manager (not all fields are required but some are and I'd like for django to keep managing it).
+Thank you for your suggestions!
+EDIT:
+as requested I'll post the views.py related function:
+#views.py
+def editnews(response, id):
+    form = ManualInputForm(response.POST or None)
+
+    #tableToView is a dataframe retrieved by querying an external DB
+    #data cannot be stored in django's built-in DB because of reasons ;-)
+
+    #checking the dataframe is correct and it is:
+    #IT IS MADE OF A SINGLE LINE
+
+    print(tableToView)
+
+    #THIS IS PROBABLY NOT THE WAY TO DO IT
+    form.source = tableToView.loc[0, 'Source']
+    form.topic = tableToView.loc[0, 'Topic']
+    form.news = tableToView.loc[0, 'News']
+    form.link = tableToView.loc[0, 'Link']
+
+    return render(response, 'manual/editnews.html', {"form":form})
+
+In the image the text should be pre-filled.
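+![]()
","Pass the values through the form's initial argument instead of setting attributes on the form object. Try something like this:
+def editnews(response, id):
+    # lowercase the column names so they match the form's field names
+    data = {k.lower(): v for k, v in tableToView.loc[0].to_dict().items()}
+    form = ManualInputForm(response.POST or None, initial=data)
+    return render(response, 'manual/editnews.html', {'form': form})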
+
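+To illustrate the key mapping in isolation, here is a minimal sketch in which a plain dict stands in for tableToView.loc[0].to_dict() (the row values are made up for illustration). The lowercased keys are what line up with the field names source, topic, news and link in ManualInputForm:
+
```python
# Hypothetical row, standing in for tableToView.loc[0].to_dict()
row = {
    "Source": "Example Wire",
    "Topic": "Markets",
    "News": "Some headline text",
    "Link": "",
}

# Same comprehension as in the view: lowercase each column name
initial = {k.lower(): v for k, v in row.items()}
print(initial)
```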
",python
+"Matplotlib background matches vscode theme on dark mode and can't see axis
+I just got a new PC and downloaded Visual Studio Code. I'm trying to run the exact same plots as the code I had on my other PC (just plt.plot(losses)) but now matplotlib seems to have a dark background instead of white:
+![]()
+I found this and this that had opposite problems.
+To clarify, I'm asking how to change the background of matplotlib plots to white (note that on my other machine I didn't have to hard-code any matplotlib background settings, so I think it's a Visual Studio Code issue, but I couldn't figure it out).
","It is difficult to be sure since I cannot reproduce your problem.
+Two things to try (both presume that you import matplotlib using import matplotlib.pyplot as plt):
+
+- If you use plt.figure, add the facecolor='white' parameter. Or try running fig.set_facecolor('white') (fig here is the variable that stores the figure whose facecolor you are changing). If you don't have one, use plt.gcf().set_facecolor('white') once the figure is created; gcf() returns the current figure, see this doc.
+- Try changing plt.style.context as in this matplotlib example.
+
",python
+"RuntimeError: Found dtype Double but expected Float - PyTorch
+I am new to PyTorch and I am working on a DQN for a time series using reinforcement learning. I needed a complex observation made of the time series and some sensor readings, so I merged two neural networks, and I am not sure if that is what is breaking my loss.backward() or something else.
+I know there are multiple questions with the same title, but none of the answers worked for me; maybe I am missing something.
+First of all, this is my network:
+class DQN(nn.Module):
+    def __init__(self, list_shape, score_shape, n_actions):
+        super(DQN, self).__init__()
+
+        self.FeatureList = nn.Sequential(
+            nn.Conv1d(list_shape[1], 32, kernel_size=8, stride=4),
+            nn.ReLU(),
+            nn.Conv1d(32, 64, kernel_size=4, stride=2),
+            nn.ReLU(),
+            nn.Conv1d(64, 64, kernel_size=3, stride=1),
+            nn.ReLU(),
+            nn.Flatten()
+        )
+
+        self.FeatureScore = nn.Sequential(
+            nn.Linear(score_shape[1], 512),
+            nn.ReLU(),
+            nn.Linear(512, 128)
+        )
+
+        t_list_test = torch.zeros(list_shape)
+        t_score_test = torch.zeros(score_shape)
+        merge_shape = self.FeatureList(t_list_test).shape[1] + self.FeatureScore(t_score_test).shape[1]
+
+        self.FinalNN = nn.Sequential(
+            nn.Linear(merge_shape, 512),
+            nn.ReLU(),
+            nn.Linear(512, 128),
+            nn.ReLU(),
+            nn.Linear(128, n_actions),
+        )
+
+    def forward(self, list, score):
+        listOut = self.FeatureList(list)
+        scoreOut = self.FeatureScore(score)
+        MergedTensor = torch.cat((listOut,scoreOut),1)
+        return self.FinalNN(MergedTensor)
+
+I have a function called calc_loss, and at its end it returns the MSE loss as below:
+    print(state_action_values.dtype)
+    print(expected_state_action_values.dtype)
+    return nn.MSELoss()(state_action_values, expected_state_action_values)
+
+and the print shows float32 and float64 respectively.
+I get the error when I run the loss.backward() as below
+LEARNING_RATE = 0.01
+optimizer = optim.Adam(net.parameters(), lr=LEARNING_RATE)
+
+for i in range(50):
+    optimizer.zero_grad()
+    loss_v = calc_loss(sample(obs, 500, 200, 64), net, tgt_net)
+    print(loss_v.dtype)
+    print(loss_v)
+    loss_v.backward()
+    optimizer.step()
+
+and the print output is as below:
+torch.float64
+tensor(1887.4831, dtype=torch.float64, grad_fn=)
+Update 1:
+I tried using a simpler model and hit the same issue. When I tried to cast the inputs to Float, I got this error:
+RuntimeError: expected scalar type Double but found Float
+
+What makes the model expect Double?
+Update 2:
+I tried to add the below line on top after the torch import but same issue of RuntimeError: Found dtype Double but expected Float
+>>> torch.set_default_tensor_type(torch.FloatTensor)
+
+But when I used the DoubleTensor I got:
+RuntimeError: Input type (torch.FloatTensor) and weight type (torch.DoubleTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
","The issue wasn't in the input to the network but in the tensors passed to the MSELoss criterion: the expected values were float64 while the predictions were float32. It worked fine after casting both to float, as below:
+return nn.MSELoss()(state_action_values.float(), expected_state_action_values.float())
+
+I decided to leave this answer for beginners like me who might be stuck and wouldn't think to check the dtypes going into the loss criterion.
",python
+"Python random module: How can I generate a random number which includes certain digits?
+I am trying to generate a random number in Python, but I need it to include certain digits.
+Let's say the range I want for it is between 100000 and 999999, so I want it to be in that range but also include digits like 1, 4, and 5.
+Is there a way to do this?
","You can build the number digit by digit:
+>>> import random
+>>> def fun(required=(),size=6):
+        result = list(required)
+        n = size-len(result)
+        result.extend( random.randint(0,9) for _ in range(n)) # fill in the remaining digits (randint is inclusive at both ends, so 0-9)
+        random.shuffle(result)
+        assert any(result) #make sure that there is at least one non zero digit
+        while not result[0]: #make sure that the first digit is non zero so the resulting number is of the required size
+            random.shuffle(result)
+        return int("".join(map(str,result)))
+
+>>> fun([1,4,5])
+471505
+>>> fun([1,4,5])
+457310
+>>> fun([1,4,5])
+912457
+>>> fun([1,4,5])
+542961
+>>> fun([1,4,5])
+145079
+>>>
+
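+An alternative sketch, not the approach above: draw numbers from the range directly and retry until all required digits appear (rejection sampling). The function name random_with_digits and its parameters are hypothetical. It assumes the required digits can actually occur in the range, otherwise the loop never ends:
+
```python
import random


def random_with_digits(required, low=100000, high=999999):
    """Return a random integer in [low, high] whose decimal form contains every required digit."""
    wanted = {str(d) for d in required}
    while True:
        n = random.randint(low, high)  # both bounds are inclusive
        if wanted <= set(str(n)):      # all required digits are present
            return n


n = random_with_digits([1, 4, 5])
print(n)
```
+
+Every qualifying number in the range is equally likely with this approach, at the cost of a few discarded draws per call.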
",python
+"Column names are not recognized? How to set the column names?
+I have a dataset for which I am not able to call the columns. In the screenshot below, I have marked in yellow what I need to be recognized as columns (Vale On, Petroleo etc.) and the Date column, which I need recognized as a date since I am working with time series data.
+I have tried resetting the index and some related solutions, but nothing worked. I am new to Python, so I am sorry if this is too obvious.
+![]()
","# use first row as column names
+df.columns = df.iloc[0]
+
+# and then drop it
+df = df.iloc[1:]
+
+# convert first col to date
+# if it doesn't work, try passing format=... (see https://strftime.org/)
+# also https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
+df['Date'] = pd.to_datetime(df['Date'])
+
+A debugging hint if parsing the date keeps failing is to check if your date strings are consistent, perhaps like so: df['Date'].str.len().value_counts(). That should hopefully return only one length. If that returns multiple rows, that means you have inconsistent and anomalous data which you'll have to clean.
",python
+"How to do cross-validation with Python?
+Hi, I built a neural network and I need to do cross-validation.
+I don't know how to do that, specifically how to train the model for it.
+If someone knows how, please write or give me some pointers.
+Here is my code:
+### Train / test split
+X = df.drop('Peso secado',axis=1) # Input variables, excluding the output variable
+y = df['Peso secado'] # Output variable
+
+from sklearn.model_selection import train_test_split
+X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=101)
+
+###
+
+from sklearn.preprocessing import MinMaxScaler
+scaler = MinMaxScaler()
+X_train= scaler.fit_transform(X_train)
+X_train
+X_test = scaler.transform(X_test)
+X_test
+
+
+
+### Model creation ###
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense, Activation
+from tensorflow.keras.optimizers import Adam
+import tensorflow as tf
+
+model = Sequential()
+num_neuronas = 50
+model.add(tf.keras.layers.Dense(units=6, activation='sigmoid', input_shape=(6, )))
+model.add(Dense(num_neuronas,activation='relu'))
+model.add(tf.keras.layers.Dense(units=1, activation='linear'))
+
+# Find the best activation function for the output layer: sigmoid or linear?
+model.summary()
+model.compile(optimizer='adam',loss='mse')
+
+### Training ###
+model.fit(x = X_train, y = y_train.values,
+ validation_data=(X_test,y_test.values), batch_size=10, epochs=1000)
+
+losses = pd.DataFrame(model.history.history)
+losses
+losses.plot()
+
+### Evaluation ###
+from sklearn.metrics import mean_squared_error,mean_absolute_error,explained_variance_score,mean_absolute_percentage_error
+X_test
+predictions = model.predict(X_test)
+mean_absolute_error(y_test,predictions)
+mean_absolute_percentage_error(y_test,predictions)
+
+mean_squared_error(y_test,predictions)
+explained_variance_score(y_test,predictions)
+
+mean_absolute_error(y_test,predictions)/df['Peso secado'].mean()
+mean_absolute_error(y_test,predictions)/df['Peso secado'].median()
+
+Any recommendations for training or validation would be helpful.
","My first observation is that the code is pretty unstructured. You should put all imports at the top of your code.
+To perform cross-validation, first import the splitter from sklearn (along with all the other modules you need):
+from sklearn.model_selection import StratifiedKFold
+
+I'd put the model definition in a separate function as such:
+def get_model():
+    # generic example: adapt the layer sizes, activations and loss to your own data
+    model = Sequential()
+    model.add(Dense(4, input_dim=8, activation='relu'))
+    model.add(Dense(1, activation='sigmoid'))
+    model.compile(loss='binary_crossentropy', optimizer='adam')
+    return model
+
+Define your variables and if you are working with tensorflow / Keras, do something like this:
+BATCH_SIZE = 64 # 128
+EPOCHS = 100
+
+k = 10
+# StratifiedKFold suits (imbalanced) classification labels;
+# for a continuous target such as 'Peso secado', use sklearn.model_selection.KFold instead
+kf = StratifiedKFold(n_splits=k, shuffle=False, random_state=None)
+
+# here comes the cross-validation
+# (X and y are assumed to be NumPy arrays; with pandas objects use X.iloc[train_index] etc.)
+fold_index = 1
+for train_index, test_index in kf.split(X, y):
+    X_train = X[train_index]
+    y_train = y[train_index]
+
+    X_test = X[test_index]
+    y_test = y[test_index]
+
+    # fit a freshly created model on the training set
+    model = get_model()
+
+    model.fit(
+        X_train,
+        y_train,
+        batch_size=BATCH_SIZE,
+        epochs=EPOCHS,
+        verbose=0,
+        validation_data=(X_test, y_test),
+    )
+
+    # predict values (requires import numpy as np)
+    # pred_values = model.predict(X_test)
+    pred_values_prob = np.array(model(X_test))
+    fold_index += 1
+
+Note: when working with tensorflow you need to create a new model in every iteration of the loop, otherwise training continues from the weights learned on the previous fold. This is not the case with sklearn, whose estimators start from freshly initialized parameters each time fit is called. Here you need to do that yourself.
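+To make the splitting step concrete, here is a minimal pure-Python sketch of what an unshuffled k-fold split yields: each sample lands in the test set exactly once. The helper k_fold_indices is hypothetical and only illustrates the idea; it is not scikit-learn's implementation:
+
```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(10, 3):
    print(train, test)
```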
",python
+"boto3 gives error when trying to stop an AWS EC2 instance using Ansible
+I am trying to create an Ansible playbook to install docker & docker-compose on the host server, stop and start the AWS EC2 instance and then restart docker.
+Everything goes well until I try to stop the instance, then this happens:
+TASK [docker_setup : Gather facts] ******************************************************************************************************************************************
+[DEPRECATION WARNING]: The 'ec2_instance_facts' module has been renamed to 'ec2_instance_info'. This feature will be removed in version 2.13. Deprecation warnings can be
+disabled by setting deprecation_warnings=False in ansible.cfg.
+fatal: [172.31.25.50]: FAILED! => {"changed": false, "msg": "boto3 required for this module"}
+
+Those steps to stop the instance look like this on the playbook:
+- name: Install boto3 and botocore with pip3 module for Gather facts
+  pip:
+    name:
+      - boto3
+      - botocore
+    executable: pip-3.7
+
+- name: Gather facts
+  action: ec2_instance_facts
+
+- name: Stop myserver instance
+  local_action:
+    module: ec2
+    region: "{{region}}"
+    instance_ids: "{{ansible_ec2_instance_id}}"
+    state: stopped
+
+The reason I installed boto3 is that Ansible was complaining about it not being installed, but even with it installed I still get the error. I also read around the Internet that I should add ansible_python_interpreter=/usr/bin/python in the hosts file next to each host, and so I did. But it didn't work. It looks like this:
+[webservers]
+172.31.25.50 ansible_python_interpreter=/usr/bin/python
+
+
+Any ideas? Thank you!
","What I did to solve this:
+Instead of setting ansible_python_interpreter in the hosts file, I learned that you can set it for a specific task via vars:
+- name: Stop instance(s)
+  vars:
+    ansible_python_interpreter: /usr/bin/python3
+  ec2_instance:
+    aws_access_key: xxxxx
+    aws_secret_key: xxxxx
+    region: "{{region}}"
+    instance_ids: "{{ansible_ec2_instance_id}}"
+    state: stopped
+
+I also used python3 instead of python for ansible_python_interpreter. When I set ansible_python_interpreter=/usr/bin/python3 in the hosts file, as I was doing before, I got a different error because it became the default interpreter for every task on that host; setting it through vars applies it only to the task that needs it.
",python
+"Which line has an error in this for loop?
+This question is from a Python course on freeCodeCamp.com:
+smallest = None
+print("Before:", smallest)
+for itervar in [3, 41, 12, 9, 74, 15]:
+    if smallest is None or itervar < smallest:
+        smallest = itervar
+        break
+    print("Loop:", itervar, smallest)
+print("Smallest:", smallest)
+
+There is a mistake in one of these lines. I thought it was the fourth line, because the variable 'smallest' is already set to None in the first line, but that's not the right answer. Also, what type of value is None and what is it for?
","You don't need the break on the 6th line. It exits the loop during the first iteration, which is not wanted there.
+Without it everything works okay.
+
+Also, what type of value is None and what is it for?
+
+None is used to represent a null value, or no value at all. It is the only value of the type NoneType.
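+For reference, a sketch of the loop with the break removed, wrapped in a function (find_smallest is a hypothetical name) so the result can be checked:
+
```python
def find_smallest(values):
    smallest = None  # None marks "no value seen yet"
    for itervar in values:
        if smallest is None or itervar < smallest:
            smallest = itervar
        print("Loop:", itervar, smallest)
    return smallest


print("Smallest:", find_smallest([3, 41, 12, 9, 74, 15]))
```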
",python
+"pivot in pandas: how to edit the columns and rows
+I am facing a problem with pivot in pandas:
+the Total_provit and numberofgoodsold columns are located above the company row.
+I need the company row to be at the top,
+with the Total_provit and numberofgoodsold columns underneath each company.
+This is my code:
+data = {'company': ['AMC', 'ER','CRR' , 'TYU'], 'Reg-ID': ['1222','2334','3444', '4566'], 'Total_provit': ['123300','12233', '3444444', '412222'], 'numberofgoodsold':['44','23','67','34']}
+
+d = pd.DataFrame(data)
+
+
+
+d.pivot(index = 'Reg-ID', columns = 'company')
+
+
","Update, ok then I think this is what you need:
+data = {'company': ['AMC', 'ER','CRR' , 'TYU'], 'Reg-ID': ['1222','2334','3444', '4566'], 'Total_provit': ['123300','12233', '3444444', '412222'], 'numberofgoodsold':['44','23','67','34']}
+
+d = pd.DataFrame(data)
+
+d2 = d.pivot(index = 'Reg-ID', columns = 'company')
+
+d2.columns = d2.columns.swaplevel(0, 1)
+d2.sort_index(axis=1, level=0, inplace=True)
+
+d2
+
+Output:
+![]()
",python
+"How to avoid the QueuePool limit error using Flask-SQLAlchemy?
+I'm developing a webapp using Flask-SQLAlchemy and a PostgreSQL database. I have a dropdown list in my webpage which is populated from a select on the DB; after selecting different values a couple of times I get the "sqlalchemy.exc.TimeoutError:".
+My package's versions are:
+Flask-SQLAlchemy==2.5.1
+psycopg2-binary==2.8.6
+SQLAlchemy==1.4.15
+
+My parameters for the DB connection are set as:
+app.config['SQLALCHEMY_POOL_SIZE'] = 20
+app.config['SQLALCHEMY_MAX_OVERFLOW'] = 20
+app.config['SQLALCHEMY_POOL_TIMEOUT'] = 5
+app.config['SQLALCHEMY_POOL_RECYCLE'] = 10
+
+The error I'm getting is:
+sqlalchemy.exc.TimeoutError: QueuePool limit of size 20 overflow 20 reached, connection timed out, timeout 5.00 (Background on this error at: https://sqlalche.me/e/14/3o7r)
+
+After changing the value of the 'SQLALCHEMY_MAX_OVERFLOW' from 20 to 100 I get the following error after some value changes on the dropdown list.
+psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: FATAL: sorry, too many clients already
+
+Every time a new value is selected from the dropdown list, four queries are triggered to the database and they are used to populate four corresponding tables in my HTML with the results from that query.
+I have a 'db.session.commit()' statement after every single query to the DB, but even though I have it, I get this error after a few value changes to my dropdown list.
+I know that I should be managing my connection sessions correctly, but I'm struggling with this. I tried setting the pool timeout to 5s instead of the default 30s, in the hope that the session would be closed and returned to the pool faster, but it doesn't seem to have helped.
+As a suggestion from @snakecharmerb, I checked the output of:
+select * from pg_stat_activity;
+
+I ran the webapp for 10 different values before it showed me an error, which means all the 20+20 sessions were used and were left in an 'idle in transaction' state.
+Does anybody have any idea or suggestion on what I should change or look for?
","I found a solution to the issue I was facing in another post on Stack Overflow.
+When you assign your flask app to your db variable, on top of indicating which Flask app it should use, you can also pass on session options, as below:
+from flask_sqlalchemy import SQLAlchemy
+db = SQLAlchemy(app, session_options={'autocommit': True})
+
+The usage of 'autocommit' solved my issue.
+Now, as suggested, I'm using:
+app.config['SQLALCHEMY_POOL_SIZE'] = 1
+app.config['SQLALCHEMY_MAX_OVERFLOW'] = 0
+
+Now everything is working as it should.
+The original post which helped me is: Autocommit in Flask-SQLAlchemy
+@snakecharmerb, @jorzel, @J_H -> Thanks for the help!
",python
+"Returning list of different results that are created recursively in Python
+Lately I've been working with some recursive problems in Python where I have to generate a list of possible configurations (i.e. a list of permutations of a given string, a list of substrings, etc.) using recursion. I'm having a very hard time finding the best practice and also understanding how to manage this sort of variable in recursion.
+I'll give the example of the generate binary trees problem. I more-or-less know what I have to implement in the recursion:
+
+- If n=1, return just one node.
+- If n=3, return the only possible binary tree.
+- For n>3, create one node and then explore the possibilities: left node is childless, right node is childless, neither node is childless. Explore these possibilities recursively.
+
+Now the thing I'm having the most trouble visualising is how exactly I am going to arrive at the list of trees. Currently my practice is to pass along a list in the function call (as an argument) and have the function return this list. The problem is that in case 3, when calling the recursive function to explore the possibilities for the nodes, it would be returning a list and not appending nodes to a tree that I am building. When I picture the recursion tree in my head, I imagine a "tree" variable that is unique to each of the tree leaves, and these trees are added to a list which is returned by the "root" (i.e. first) call. But I don't know if that is possible. I thought of a global list, with the recursive function not returning anything (just appending to it), but I believe the problem is that at each call the function would receive a copy of the variable.
+How can I deal with generating combinations and returning lists of configurations in these cases in recursion? While I gave an example, the more general the answer the better. I would also like to know if there is a "best practice" when it comes to that.
","
+Currently the practice I do is pass along a list in the function call (as an argument) and the function would return this list
+
+This is not the purest way to attack a recursive problem. It would be better if you can make the recursive function such that it solves the sub problem without an extra parameter variable that it must use. So the recursive function should just return a result as if it was the only call that was ever made (by the testing framework). So in the example, that recursive call should return a list with trees.
+Alternatively the recursive function could be a sub-function that doesn't return a list, but yields the individual values (in this case: trees). The caller can then decide whether to pack that into a list or not. This is more pythonic.
+As to the example problem, it is also important to identify some invariants. For instance, it is clear that there are no solutions when n is even. As to recursive aspect: once you have decided to create a root, then both its left and right sided subtree will have an odd number of nodes. Of course, this is an observation that is specific to this problem, but it is important to look for such problem properties.
+Finally, it is equally important to see if the same sub problems can reoccur multiple times. This surely is the case in the example problem: for instance, the left subtree may sometimes have the same number of nodes as the right subtree. In such cases memoization will improve efficiency (dynamic programming).
+When the recursive function returns a list, the caller can then iterate that list to retrieve its elements (trees in the example), and use them to build an extended result that satisfies the caller's task. In the example case that means that the tree taken from the recursively retrieved list, is appended as a child to a new root. Then this new tree is appended to a new list (not related to the one returned from the recursive call). This new list will in many cases be longer, although this depends on the type of problem.
+To further illustrate the way to tackle these problems, here is a solution for the example problem: one which uses the main function for the recursive calls, and using memoization:
+class Solution:
+    # TreeNode, List and Optional are assumed to be provided by the problem's environment
+    memo = { 1: [TreeNode()] }
+
+    def allPossibleFBT(self, n: int) -> List[Optional[TreeNode]]:
+        # If we didn't solve this problem before...
+        if n not in self.memo:
+            # Create a list for storing the results (the trees)
+            results = []
+            # Before creating any root node,
+            # decide the size of the left subtree.
+            # It must be odd
+            for num_left in range(1, n, 2):
+                # Make the recursive call to get all shapes of the
+                # left subtree
+                left_shapes = self.allPossibleFBT(num_left)
+                # The remainder of the nodes must be in the right subtree
+                num_right = n - 1 - num_left # The root also counts as 1
+                right_shapes = self.allPossibleFBT(num_right)
+                # Now iterate the results we got from recursion and
+                # combine them in all possible ways to create new trees
+                for left in left_shapes:
+                    for right in right_shapes:
+                        # We have a combination. Now create a new tree from it
+                        # by putting a root node on top of the two subtrees:
+                        tree = TreeNode(0, left, right)
+                        # Append this possible shape to our results
+                        results.append(tree)
+            # All done. Save this for later re-use
+            self.memo[n] = results
+        return self.memo[n]
+
+This code can be made more compact using list comprehension, but it may make the code less readable.
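+The generator alternative mentioned above can be sketched as follows. This is a minimal, self-contained illustration rather than the judge's interface: the Node class and the function name full_binary_trees are assumptions made for the example, and no memoization is used, so sub-results are recomputed:
+
```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def full_binary_trees(n: int) -> Iterator[Node]:
    """Yield every full binary tree with n nodes (nothing is yielded when n is even)."""
    if n == 1:
        yield Node()
        return
    # The left subtree gets an odd number of nodes; the root takes one node,
    # and the right subtree gets the rest
    for num_left in range(1, n - 1, 2):
        num_right = n - 1 - num_left
        for left in full_binary_trees(num_left):
            for right in full_binary_trees(num_right):
                yield Node(left, right)


# The caller decides whether to collect the trees into a list
trees = list(full_binary_trees(7))
print(len(trees))
```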
",python
+"why can't I send messages from flask html to flask python?
+I have been working on a program in Flask that allows you to search a database. I have not coded the actual database lookup yet, but I don't need that right now, and the database does not affect anything at this point. I have not been able to work out how to get what the user types in the form to the Python program. It runs with no errors, but when I check what I received I get None. Is there something I'm doing wrong? This is my code; it is very messy and just one file.
+main.py
+from flask import Flask, render_template
+from flask import request
+import pdfkit, time
+
+
+
+
+def go(letters):
+    data = open('data.txt','r')
+    return letters
+
+
+app = Flask(__name__)
+
+@app.route("/path/", methods=['GET','POST'])
+def search():
+    time.sleep(1)
+    data=request.get_data()
+    print(data)
+    return go(data)
+
+@app.route('/')
+def index():
+
+    return render_template('index.html')
+
+if __name__ == "__main__":
+    from waitress import serve
+    serve(app, host="0.0.0.0", port=8080)
+    app.run(debug=True)
+
+templates/index.html
+<!Doctype html!>
+
+<html land=en>
+<h1>Welcome!</h1><br>
+<p>Type to search the database.</p>
+<br><form name='this' onsubmit='letsgo()' class='text' action='/path/' method='post'><input id='hey' type='text'> <input type='submit' value='search'></form>
+<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
+<script>
+var test = document.getElementById('hey').value;
+ const xhr = new XMLHttpRequest();
+function letsgo() {
+
+ const data = document.getElementById("hey").value
+ alert(data)
+$.ajax({
+ type : 'POST',
+ url : "{{'https://file-encrypter.ashwinchera.repl.co/path/'}}",
+ dataType: 'data',
+ datas : {'data':data}
+});
+
+
+};
+</script>
+
+also I am working with a friend, so I don't know what some of this stuff is here for. Can someone tell me how I can send this data? I have been trying things from other questions, but they don't work.
+Thank you in advance!
","There are multiple issues here. As mentioned in the comments, I recommend working slowly and breaking the problem down into small pieces. Check that each piece works before rushing ahead and accumulating many errors that are hard to unravel.
+Most of the problems are on the front-end, so you'll want to use the browser console to inspect errors. You can also use an HTML validator tool to make sure your HTML makes sense and to catch typos like land=en.
+Since it sounds like you want to POST without a page refresh and you're using jQuery, many properties on your form are unnecessary:
+onsubmit='letsgo()' action='/path/' method='post'
+
+can all be removed. While you're at it, remove any unused noise like:
+var test = document.getElementById('hey').value;
+ const xhr = new XMLHttpRequest();
+
+and unnecessary ids and classes. These are just adding to the confusion. When things don't make sense and aren't working, try stripping out code rather than adding it.
+"{{'https://file-encrypter.ashwinchera.repl.co/path/'}}" should just be /path so that it'll work on any domain such as a localhost. If you're working cross-origin, that's another story, but I don't think you are.
+In the $.ajax call, datas is a typo. That should be data.
+const data = document.getElementById("hey").value isn't necessary. If you're bothering to import jQuery, you might as well use it all the way: $("#hey").val(). #hey and letsgo are unclear names that don't make it any easier to debug the app.
+Use event.preventDefault() to prevent the form submission.
+On the backend, once again, remove any cruft and noise like the file read and import pdfkit, time. It seems strange to add GET to the list of accepted verbs for the /path route (which is too generically-named, as is go).
+Since you're using form data, request.get_data() can be request.form.get("data") where "data" is the key you want to retrieve from the parsed form.
+Here's a minimal AJAX example to get you moving:
+templates/index.html:
+<!DOCTYPE html>
+<html lang="en">
+<head>
+ <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
+</head>
+<body>
+ <h1>Welcome!</h1>
+ <p>Type to search the database.</p>
+ <form>
+ <input id="search-term">
+ <input type="submit" value="search">
+ </form>
+ <div id="result"></div>
+
+<script>
+$("form").submit(function (event) {
+ event.preventDefault();
+ var data = $("#search-term").val();
+ $.ajax({
+ type: "POST",
+ url: "search",
+ data: {data},
+ success: data => $("#result").text(data),
+ error: res => console.error(res),
+ });
+});
+</script>
+</body>
+</html>
+
+app.py:
+from flask import Flask, render_template, request
+
+app = Flask(__name__)
+
+@app.route("/search", methods=["POST"])
+def search():
+ data = request.form.get("data")
+ print(data)
+ return data
+
+@app.route("/")
+def index():
+ return render_template("index.html")
+
+if __name__ == "__main__":
+ app.run(host="127.0.0.1", port=8080, debug=True)
+
+
+If you want to submit and render a new page, you don't need jQuery:
+templates/index.html:
+<!DOCTYPE html>
+<html lang="en">
+<body>
+ <h1>Welcome!</h1>
+ <p>Type to search the database.</p>
+ <form action="/" method="post">
+ <input name="search-term">
+ <input type="submit" value="search">
+ </form>
+ <div>{{search_term}}</div>
+</body>
+</html>
+
+app.py:
+from flask import Flask, render_template, request
+
+app = Flask(__name__)
+
+@app.route("/", methods=["GET", "POST"])
+def index():
+ search_term = request.form.get("search-term", "")
+ return render_template("index.html", search_term=search_term)
+
+if __name__ == "__main__":
+ app.run(host="127.0.0.1", port=8080, debug=True)
+
+Or point to another page:
+templates/index.html:
+<!DOCTYPE html>
+<html lang="en">
+<body>
+ <h1>Welcome!</h1>
+ <p>Type to search the database.</p>
+ <form action="/search" method="post">
+ <input name="search-term">
+ <input type="submit" value="search">
+ </form>
+</body>
+</html>
+
+templates/search.html:
+<!DOCTYPE html>
+<html lang="en">
+<body>
+ <p>You searched: '{{search_term}}'</p>
+</body>
+</html>
+
+app.py:
+from flask import Flask, render_template, request
+
+app = Flask(__name__)
+
+@app.route("/search", methods=["POST"])
+def search():
+ search_term = request.form.get("search-term", "")
+ return render_template("search.html", search_term=search_term)
+
+@app.route("/")
+def index():
+ return render_template("index.html")
+
+if __name__ == "__main__":
+ app.run(host="127.0.0.1", port=8080, debug=True)
+
",python
+"Python - how to write '\\n' to a fileI need to write \\n into a file.
+The problem I have is I get only \n
+def read_file(input_path):
+ f = open(input_path, "r")
+ read_lines = f.readlines()
+ read_lines_length = len(read_lines)
+ r = 0
+ while r < read_lines_length:
+ read_lines[r]= read_lines[r].replace('\n','')
+ r+=1
+ f.close()
+
+ for element in range(len(read_lines)):
+ read_lines[element].replace('\n', '\\n').replace('\r', '\\r')
+
+ return read_lines
+
+With this I am able to store \\n in a list, but when I use f.write() it only writes \n. Normally you would just write print('\\\\n') to get \\n, but when I use .replace("\n", "\\\\n") it stays \\n in the list:
+['print("Enter: \'stop\' --> exit function.\\n"']
+
+The Output:
+read = read_file(input_path)
+for element in range(len(read)):
+ print(read[element])
+
+---> print("Enter: 'stop' --> exit function.\n"
+
+What I want to accomplish is a script that automatically creates an encrypted Python file from my normal Python file.
+Here some files to work with:
+Core.py
+packager.py
+what I want to have
+And here the folder Structure: imgur
","I think the problem is just how print displays strings that sit inside a list. The \n in the list really is \n, not \\n. Consider the following example: a contains \n between Hello and World, so print(a) shows the two words on separate lines. b is a list containing a, and print(b) shows the repr of each element, in which the newline appears as the escape sequence \n.
+a = "Hello\nWorld"
+print(a)
+b = [a]
+print(b)
+print(b[0])
+
+The output:
+Hello
+World
+['Hello\nWorld']
+Hello
+World
+
+Edit based on question changes:
+Changing read_file as follows may help:
+def read_file(input_path):
+ f = open(input_path, "r")
+ read_lines = f.readlines()
+ read_lines = [r.replace("\\n",'\\\\n').replace('\n','') for r in read_lines]
+ f.close()
+ return read_lines
+
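If the goal is to turn real line breaks into the visible two-character sequence before writing, the idea can be sketched as below. This is only a minimal illustration: escape_newlines is a hypothetical helper name that does not appear in the question, and the sample string is made up.

```python
def escape_newlines(text):
    # Replace real line breaks with the two-character sequences \n and \r.
    return text.replace('\r', '\\r').replace('\n', '\\n')

line = 'Hello\nWorld'
escaped = escape_newlines(line)

print(line)       # two lines on screen: the string holds a real newline
print([line])     # the list shows repr(line), so the newline appears as \n
print(escaped)    # one line on screen: a backslash followed by the letter n
```

Passing escaped to f.write() would then put a literal backslash and n into the file instead of a line break.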
",python
+"Django view to download a file from serverI have a views.py that:
+
+- creates some .xlsx files
+- selects the correct .zip and places the files inside
+
+After that, I want this .zip to be downloaded automatically. I did some research and tried several snippets, but none worked.
+I created a "temp" folder in the root of the app where the created files are stored.
+simplified view.py
+def generate_ws(request,cource,ca_id):
+ ca = get_object_or_404(CreditAnalysis,pk=ca_id)
+ ca_owners = CAOwner.objects.filter(ca_operation=ca)
+ mo_farms = MOFarm.objects.filter(ca_operation=ca)
+ misses = []
+
+ generate_owner_mo(ca_owner,misses,city)
+ zip_name = 'temp/MOs - ' + str(ca_owner.owner) + '.zip'
+ zf = zipfile.ZipFile(zip_name,'w')
+ zf.close()
+
+ generate_farm_mo(mo_farm,misses,city)
+ generate_production_mo(ca,misses,city,production_city,pks)
+
+ files = glob.glob('temp/*.xlsx') #SELECT FILES AND PUT IN .ZIP
+ for file in files:
+ file_key = file.split('.')[0]
+ file_key=file_key.split(' - ')
+ for ca_owner in ca_owners:
+ zip_name = 'temp/MOs - ' + str(ca_owner.owner) + '.zip'
+ if str(ca_owner.owner) in file_key:
+ zf = zipfile.ZipFile(zip_name,'a')
+ new_file_name = file[5:]
+ zf.write(file,new_file_name)
+ zf.close()
+ break
+ files = glob.glob('temp/*.zip') # GET .ZIP FILES
+ for file in files:
+ download_mo(request,file) # CREATE A DOWNLOAD FOR EACH .ZIP FILE
+
+ misses = list(set(misses))
+
+ return render(request,'generate_mo.html',{'misses':misses,})
+
+download_mo
+def download_mo(request,file):
+ path_to_file = os.path.realpath(file)
+ with open(path_to_file,'rb') as fh:
+ response = HttpResponse(fh.read())
+ file_name = file[5:] #WITHDRAW "temp/"
+ response['Content-Disposition'] = 'inline; filename=' + file_name
+ return response
+
+Everything works correctly except the download which never starts
+","To download a file, the view has to return a FileResponse to the user. Calling a helper function that returns a FileResponse doesn't work, because that response is never returned to the user: in your case the user only receives render(request, 'generate_mo.html', {'misses': misses}), so nothing is downloaded.
+A single response can't deliver several files one after another, so I suggest putting them all in one .zip or .tar archive; then you only need to return one FileResponse.
+Since you also need to render your template, you can redirect to your download_mo view when the template loads, so that the file is downloaded while the template is displayed.
+Now, for your download_mo view, just replace your HttpResponse with a FileResponse:
+from django.http import FileResponse
+def download_mo(request,file):
+ path_to_file = os.path.realpath(file)
+ response = FileResponse(open(path_to_file, 'rb'))
+ file_name = file[5:]
+ response['Content-Disposition'] = 'inline; filename=' + file_name
+ return response
+
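For the suggestion of bundling everything into one archive, here is a rough sketch using only the standard library. The helper name build_zip and the sample file names are assumptions made for illustration; they are not part of the view in the question.

```python
import io
import zipfile

def build_zip(files):
    # Pack a {name: bytes} mapping into one in-memory .zip archive.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w', zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    buffer.seek(0)  # rewind so the caller reads from the start
    return buffer

bundle = build_zip({'owner_a.xlsx': b'first', 'owner_b.xlsx': b'second'})
print(zipfile.ZipFile(bundle).namelist())
```

Since FileResponse accepts any file-like object, a buffer like this could presumably be passed to it directly instead of an opened file, which would also avoid leaving temporary archives on disk.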
",python
+"How can I change the source of background image in Kivy?Here is how I set the background:
+Builder.load_string('''
+<AppInterface>:
+ orientation: 'vertical'
+ canvas.before:
+ Rectangle:
+ id: backg # im not sure if i can set an id for this
+ pos: self.pos
+ size: self.size
+ source: 'assets/background.jpg'
+#rest of the code goes here
+''')
+
+Later, I want to change the source of the image when the user presses a button, but I don't know how to do that. I tried a few things like
+self.ids.backg.source = 'assets/background2.jpg'
+
+but this didn't work. I am using FloatLayout. Any idea how I can do this?
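+","Reference the source through a StringProperty.
+On the .py side, import StringProperty and declare it as a class attribute of AppInterface:
+from kivy.properties import StringProperty
+
+background_source = StringProperty('assets/background.jpg')
+
+On the .kv side:
+source: root.background_source
+
+Now, to change the image, just assign a new path from inside an AppInterface method (on the .py side):
+self.background_source = 'assets/background2.jpg'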
+
",python
+"Concatenate different values that belong to the same groupI have a data frame like this
+
+import pandas as pd
+
+
+data = [
+ ['ACOT', '00001', '', '', 1.5, 20, 30, 'AA'],
+ ['ACOT', '00002', '', '', 1.7, 20, 33,'BB'],
+ ['ACOT', '00003', '','NA_0001' ,1.4, 20, 40,'AA'],
+ ['PAN', '000090', 'canonical', '', 0.5, 10, 30,'DD'],
+ ['PAN', '000091', '', '', 0.4, 10, 30,'CC'],
+ ['TOM', '000080', 'canonical', '', 0.4, 10, 15,'EE'],
+ ['TOM', '000040', '', '', 1.7, 10, 300,'EE']
+]
+
+df = pd.DataFrame(data, columns=[
+ 'Gene_name', 'Transcript_ID', 'canonical', 'mane', 'metrics','start','end', 'Example_extra_col'])
+
+
+
+Gene_name Transcript_ID canonical mane metrics start end Example_extra_col
+0 ACOT 00001 1.5 20 30 AA
+1 ACOT 00002 NA_0001 1.7 20 33 BB
+2 ACOT 00003 1.4 20 40 AA
+3 PAN 000090 canonical NA_00090 0.5 10 30 DD
+4 PAN 000091 0.4 10 30 CC
+5 TOM 000080 canonical 0.4 10 15 EE
+6 TOM 000040 1.7 10 300 EE
+
+
+And I am reducing the rows, trying not to lose data, with these lines:
+out = (df
+ .groupby('Gene_name', as_index=False)
+ .agg({'canonical': 'any',
+ 'mane': 'any',
+ 'metrics': lambda x: f'{x.min()}-{x.max()}',
+ 'Example_extra_col': 'first', # Here is the one I want to change
+ })
+ .replace({True: 'Yes', False: 'No'})
+)
+
+However, for the last column, I want to concatenate the values if those belonging to the same group are different:
+ Gene_name canonical mane metrics Example_extra_col
+0 ACOT No Yes 1.4-1.7 AA,BB
+1 PAN Yes No 0.4-0.5 DD,CC
+2 TOM Yes No 0.4-1.7 EE
+
+How can I do this with .agg?
","Try this:
+# Custom aggregation function
+func = lambda x: ", ".join([y for y in x.fillna("").unique() if y])
+
+(
+ df.groupby("Gene_name", as_index=False)
+ .agg(
+ {
+ "canonical": "any",
+ "mane": "any",
+ "metrics": lambda x: f"{x.min()}-{x.max()}",
+ "Example_extra_col": func,
+ }
+ )
+ .replace({True: "Yes", False: "No"})
+)
+
+(Edited to support NaNs in the column)
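To see what the custom aggregation does to one group, the same logic can be sketched in plain Python. This is an illustration only: join_unique is a hypothetical name, and it handles empty strings and None rather than pandas NaN values.

```python
def join_unique(values, sep=', '):
    # dict.fromkeys keeps the first occurrence of each value, in order;
    # empty strings and None are skipped by the truthiness check.
    return sep.join(dict.fromkeys(v for v in values if v))

print(join_unique(['AA', 'BB', 'AA']))  # the ACOT group
print(join_unique(['EE', 'EE']))        # the TOM group
```

Changing sep to ',' would give the separator shown in the desired output.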
",python
+"I Want to get 2 types of Events, that are, Past Events and Future Events from my Events Model in a Debating Society Website made using DjangoI have made a Website for the Debating Society of our College using Django.
+I would like to have two types of events, past and upcoming (future), according to date_of_competition: if the date and time of the competition is before the current date and time, return it in past events; if it is after the current date and time, return it in future events.
+Here are my views.py file and models.py file for events
+models.py:
+from django.db import models
+
+
+class Format(models.Model):
+ format_name = models.CharField(max_length=100, null=False, unique=True)
+
+ def __str__(self):
+ return self.format_name
+
+
+class Organiser(models.Model):
+ organiser_name = models.CharField(max_length=140, null=False, unique=True)
+
+ def __str__(self):
+ return self.organiser_name
+
+
+class Event(models.Model):
+ banner_image = models.ImageField(upload_to="events")
+ event_name = models.CharField(max_length=150, null=False)
+ organiser_of_event = models.ForeignKey(Organiser, on_delete=models.CASCADE)
+ format_of_event = models.ForeignKey(Format, on_delete=models.CASCADE)
+ date_of_event = models.DateTimeField(auto_now_add=False)
+ registration_fees = models.IntegerField(default=0, help_text="Enter Registration Fees For The Event in Rupees")
+ details = models.TextField(null=True, blank=True)
+ created_at = models.DateTimeField(auto_now_add=True)
+ updated_at = models.DateTimeField(auto_now=True)
+
+ def __str__(self):
+ return self.event_name
+
+view.py:
+from django.shortcuts import render
+from .models import Event
+
+
+# Create your views here.
+def events(request):
+ context = {
+ 'title': 'Events',
+ 'events': Event.objects.all()
+ }
+ return render(request, 'main/Events.html', context)
+
+
+What logic should be written in order to get both future and past events from my events table?
+(If there's anything you don't understand or need something extra, feel free to ask for it).
+","The obvious way is to run two queries, using the __gte (greater than or equal) and __lt (less than) field lookups.
+from django.utils import timezone
+
+# ...
+
+now = timezone.now()
+
+context = {
+ 'title': 'Events',
+ 'future_events': Event.objects.filter(date_of_event__gte=now),
+ 'past_events': Event.objects.filter(date_of_event__lt=now),
+}
+
+You could also do a single query and do the filtering in Python:
+now = timezone.now()
+all_events = Event.objects.all()
+future_events = [e for e in all_events if e.date_of_event >= now]
+past_events = [e for e in all_events if e.date_of_event < now]
+
+context = {
+ 'title': 'Events',
+ 'future_events': future_events,
+ 'past_events': past_events,
+}
+
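The Python-side filtering can also be done in a single pass over the events. A small sketch with plain tuples standing in for Event objects follows; split_events and the sample event names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def split_events(events, now):
    # Split (name, when) pairs into future and past lists in one pass.
    future, past = [], []
    for name, when in events:
        (future if when >= now else past).append(name)
    return future, past

now = datetime.now(timezone.utc)
events = [
    ('Freshers debate', now - timedelta(days=7)),
    ('Annual tournament', now + timedelta(days=7)),
]
future, past = split_events(events, now)
print(future, past)
```

Both datetimes are timezone-aware here, which matches what timezone.now() returns in Django when USE_TZ is enabled.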
",python
+"Python/OpenCV: Save pictures with defined path and nameI would like to save an image in a defined path and with a defined name.
+The path is fixed, but the name comes from a variable.
+Using the name alone is no problem, as you can see below, but how can I add the fixed path?
+import cv2
+
+img_name = "TEST"
+
+cam = cv2.VideoCapture(0)
+
+
+cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
+cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
+
+while(True):
+
+ ret, frame = cam.read()
+ cv2.imshow('preview',frame)
+
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ img_name = "{}.jpg".format(img_name)
+ cv2.imwrite(img_name, frame)
+ break
+
+cam.release()
+cv2.destroyAllWindows()
+
+Thanks and best regards :)
","import os
+import cv2
+
+img_name = "TEST"
+
+cam = cv2.VideoCapture(0)
+
+
+cam.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
+cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
+
+while(True):
+
+ ret, frame = cam.read()
+ cv2.imshow('preview',frame)
+
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ img_name = os.path.join("your_path_here", "{}.jpg".format(img_name))
+ cv2.imwrite(img_name, frame)
+ break
+
+cam.release()
+cv2.destroyAllWindows()
+
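The only change from the question is the os.path.join call. As a hedged sketch of the path handling on its own (build_image_path and the captures folder are assumptions for illustration), it can help to create the target folder first, since cv2.imwrite typically returns False without raising when the directory does not exist:

```python
import os
import tempfile

def build_image_path(directory, name, extension='.jpg'):
    # Create the target folder if needed and return the full file path.
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, name + extension)

# A temporary folder is used here so the sketch runs anywhere.
out_dir = os.path.join(tempfile.mkdtemp(), 'captures')
img_path = build_image_path(out_dir, 'TEST')
print(img_path)
```

The returned path would then be passed to cv2.imwrite(img_path, frame).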
",python
+"PyGithub - Can't set attribute error while trying to change default branchI've written this code to change the default branch from "master" to "release".
+from github import Github
+g = Github("github token", verify=False, base_url="url to repo")
+
+repo = g.get_repo("repo name")
+repo.default_branch = 'release'
+
+I am getting the following error.
+ repo.default_branch = 'release'
+AttributeError: can't set attribute
+
+I am the admin of that repository and I created the branch. I don't think this is an access issue. What am I doing incorrectly?
","The default_branch attribute is a read-only attribute; if you want to change the default branch you need to use the edit method:
+repo.edit(default_branch='release')
+
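The error message is what Python raises for a property that has no setter. The toy class below is not PyGithub's implementation, just a minimal sketch of the same pattern with a made-up Repo class whose edit method only updates a local value:

```python
class Repo:
    # Toy stand-in for an object that exposes a read-only attribute.
    def __init__(self, default_branch):
        self._default_branch = default_branch

    @property
    def default_branch(self):
        return self._default_branch

    def edit(self, default_branch):
        # The real library sends a request to the API at this point.
        self._default_branch = default_branch

repo = Repo('master')
try:
    repo.default_branch = 'release'
except AttributeError as exc:
    print('assignment rejected:', exc)

repo.edit(default_branch='release')
print(repo.default_branch)
```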
",python
+"Getting Value from div with class having another blank tag Python BeautifulSoupI am trying to get the value '5' from the following:
+<div class="DrugPriceBox__price___dj2lv">₹<!-- -->5</div>
+
+and the python-beautifulsoup code I used returns nothing:
+drugprice=soup.find('div', class_="DrugPriceBox__price___dj2lv")
+
+print(drugprice)
+
+The webpage url is: https://www.1mg.com/drugs/acticort-5mg-tablet-321932
+Thank you in advance!
+Additional information:
+The WORKING CODE after solving the problem:
+
+
+
if __name__ == '__main__':
+ #turl='https://www.1mg.com/drugs/acticort-5mg-tablet-321932'
+ turl='https://www.1mg.com/drugs/zerodol-sp-tablet-67307'
+ print (turl)
+ soup = BeautifulSoup(requests.get(turl,headers=headers).content, ""html.parser"")
+ #type 'div' pricing format
+ div = soup.find('div', class_='DrugPriceBox__price___dj2lv')
+ if div:
+ print(div.text)
+ else:
+ #type 'span' pricing format
+ #span =soup.find('span', class_=""PriceBoxPlanOption__offer-price___3v9x8 PriceBoxPlanOption__offer-price-cp___2QPU_"")
+ span =soup.find('span', class_=""PriceBoxPlanOption__margin-right-4___2aqFt PriceBoxPlanOption__stike___pDQVN"")
+ if span:
+ print(span.text)
+ else:
+ print('Nada')
+
+
+","Not sure why it isn't working for you. I cannot see the original site because it is blocked for me, but I ran a local Express server that serves exactly the div you posted, and the code below worked fine for me.
+import string
+
+import bs4
+import requests
+
+if __name__ == '__main__':
+ r = requests.get('http://localhost:3000/')
+ soup = bs4.BeautifulSoup(r.text)
+ div = soup.find('div', class_='DrugPriceBox__price___dj2lv')
+ acceptable_chars = set(string.ascii_letters + string.digits + '.')
+ drugprice = ''.join(char for char in div.text if char in acceptable_chars)
+ print(drugprice)
+
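As an alternative to filtering character by character, a regular expression can pull the number out of the element text. This is only a sketch: extract_price is a hypothetical helper, and it assumes the price is the first number in the text.

```python
import re

def extract_price(text):
    # Return the first integer or decimal number in the text, or None.
    match = re.search(r'\d+(?:\.\d+)?', text)
    return match.group(0) if match else None

print(extract_price('₹5'))
print(extract_price('₹12.50'))
```

It would be called as extract_price(div.text) once the div has been found.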
",python
+"Machine Learning Model Only Predicting Mode in Data SetI am trying to do sentiment analysis for text. I have 909 phrases commonly used in emails, and I scored them out of ten for how angry they are, when isolated.
Now, I upload this .csv file to a Jupyter Notebook, where I import the following modules:
+import numpy as np
+import pandas as pd
+from sklearn.model_selection import train_test_split
+from sklearn.naive_bayes import MultinomialNB
+from sklearn.feature_extraction.text import TfidfVectorizer
+
+Now, I define both columns as 'phrases' and 'anger':
+df=pd.read_csv('Book14.csv', names=['Phrase', 'Anger'])
+df_x = df['Phrase']
+df_y = df['Anger']
+
+Subsequently, I split this data such that 20% is used for testing and 80% is used for training:
+x_train, x_test, y_train, y_test = train_test_split(df_x, df_y, test_size=0.2, random_state=4)
+
+Now, I convert the words in x_train to numerical data using TfidfVectorizer:
+tfidfvectorizer = TfidfVectorizer(analyzer='word', stop_words='en')
+x_traincv = tfidfvectorizer.fit_transform(x_train.astype('U'))
+
+Now, I convert x_traincv to an array:
+a = x_traincv.toarray()
+
+I also convert x_testcv to a numerical array:
+x_testcv=tfidfvectorizer.fit_transform(x_test)
+x_testcv = x_testcv.toarray()
+
+Now, I have
+mnb = MultinomialNB()
+b=np.array(y_test)
+error_score = 0
+b=np.array(y_test)
+for i in range(len(x_test)):
+ mnb.fit(x_testcv,y_test)
+ testmessage=x_test.iloc[i]
+ predictions = mnb.predict(x_testcv[i].reshape(1,-1))
+ error_score = error_score + (predictions-int(b[i]))**2
+ print(testmessage)
+ print(predictions)
+print(error_score/len(x_test))
+
+However, an example of the results I get are:
+
+Bring it back
+[0]
+It is greatly appreciatd when
+[0]
+Apologies in advance
+[0]
+Can you please
+[0]
+See you then
+[0]
+I hope this email finds you well.
+[0]
+Thanks in advance
+[0]
+I am sorry to inform
+[0]
+You’re absolutely right
+[0]
+I am deeply regretful
+[0]
+Shoot me through
+[0]
+I’m looking forward to
+[0]
+As I already stated
+[0]
+Hello
+[0]
+We expect all students
+[0]
+If it’s not too late
+[0]
+
+and this repeats on a large scale, even for phrases that are obviously very angry. When I removed all data containing a '0' from the .csv file, the now modal value (a 10) is the only prediction for my sentences.
Why is this happening? Is it some weird way to minimise error? Are there any inherent flaws in my code? Should I take a different approach?","Two things. First, you are fitting the MultinomialNB on the test set: in your loop you have mnb.fit(x_testcv,y_test), but you should call mnb.fit(x_traincv,y_train), and only once, before the loop.
+Second, when pre-processing you should call fit_transform only on the training data; on the test data you should call only transform, so that both sets share the vocabulary learned from the training set.
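To make the fit/transform distinction concrete, here is a toy vectorizer written in plain Python. It is not scikit-learn's TfidfVectorizer, only a simplified stand-in (ToyCountVectorizer and the sample phrases are made up) that shows why the test data must be transformed with the vocabulary learned from the training data:

```python
class ToyCountVectorizer:
    def fit(self, texts):
        # Learn the vocabulary from the given texts only.
        words = sorted({w for t in texts for w in t.lower().split()})
        self.vocabulary_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, texts):
        # Count known words; words outside the vocabulary are ignored.
        rows = []
        for t in texts:
            row = [0] * len(self.vocabulary_)
            for w in t.lower().split():
                if w in self.vocabulary_:
                    row[self.vocabulary_[w]] += 1
            rows.append(row)
        return rows

train = ['thanks in advance', 'this is unacceptable']
test = ['thanks this is great']

vec = ToyCountVectorizer().fit(train)  # fit on the training data only
x_train = vec.transform(train)
x_test = vec.transform(test)           # same columns as x_train
print(len(x_train[0]), len(x_test[0]))
```

Fitting a second time on the test phrases would produce a different vocabulary, so the columns would no longer mean the same thing as the ones the model was trained on.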
",python
+"Translate array into x and y direction - PythonWe have the following two-dimensional array with x and y coordinates:
+x = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]])
+
+We flatten it: x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
+and our goal is to apply translations in the x direction and the y direction.
+We are dealing with a 4x4 array (lattice), and the first transformation is 1 shift into x direction :
+so from '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]' we get '[1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12]'.
+The next transformation is two shifts in x:
+from '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]' we get '[2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13]'.
+We want to get this (flattened) array:
+y = np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
+ [1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12],
+ [2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13],
+ [3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14],
+ [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3],
+ [5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0],
+ [6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1],
+ [7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2],
+ [8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7],
+ [9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4],
+ [10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5],
+ [11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6],
+ [12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
+ [13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8],
+ [14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9],
+ [15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10]])
+
+I tried using:
+y = np.roll(np.roll(x, -1), -1)
+
+","You can combine two np.roll calls inside a single np.vstack: roll along axis=0 for the shift in y and along axis=1 for the shift in x, then flatten each result (arr is the original 4x4 array).
+np.vstack([np.roll(np.roll(arr, -i, axis=0), -j, axis=1).flatten()
+           for i in range(arr.shape[0])
+           for j in range(arr.shape[1])])
+
+
+array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
+ [ 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12],
+ [ 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13],
+ [ 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14],
+ [ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3],
+ [ 5, 6, 7, 4, 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0],
+ [ 6, 7, 4, 5, 10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1],
+ [ 7, 4, 5, 6, 11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2],
+ [ 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7],
+ [ 9, 10, 11, 8, 13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4],
+ [10, 11, 8, 9, 14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5],
+ [11, 8, 9, 10, 15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6],
+ [12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
+ [13, 14, 15, 12, 1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8],
+ [14, 15, 12, 13, 2, 3, 0, 1, 6, 7, 4, 5, 10, 11, 8, 9],
+ [15, 12, 13, 14, 3, 0, 1, 2, 7, 4, 5, 6, 11, 8, 9, 10]])
+
",python
+"How to mask only the most recent date?Let's say I have 2 days worth of date times:
+import pandas as pd
+import numpy as np
+
+index = pd.date_range("2020-01-01 00:00:00", "2020-01-03 00:00:00", freq="15T")
+
+print(index)
+DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 00:15:00',
+ '2020-01-01 00:30:00', '2020-01-01 00:45:00',
+ '2020-01-01 01:00:00', '2020-01-01 01:15:00',
+ '2020-01-01 01:30:00', '2020-01-01 01:45:00',
+ '2020-01-01 02:00:00', '2020-01-01 02:15:00',
+ ...
+ '2020-01-02 21:45:00', '2020-01-02 22:00:00',
+ '2020-01-02 22:15:00', '2020-01-02 22:30:00',
+ '2020-01-02 22:45:00', '2020-01-02 23:00:00',
+ '2020-01-02 23:15:00', '2020-01-02 23:30:00',
+ '2020-01-02 23:45:00', '2020-01-03 00:00:00'],
+ dtype='datetime64[ns]', length=193, freq='15T')
+
+And then I declare a pandas array based on my index with each element set to False:
+entries = pd.Series(False, index=index)
+
+I then get the market times, and the open times:
+market_hours_indices = entries.index.indexer_between_time('9:30', '16:00')
+market_hours_index = entries.index[market_hours_indices]
+open_hours_indices = entries.index.indexer_between_time('9:30', '10:30')
+open_hours_index = entries.index[open_hours_indices]
+
+And then mask my initial pandas array with my indexes:
+entries = entries[market_hours_index]
+entries[~entries.index.isin(open_hours_index)] = False
+entries
+
+This gives me:
+2020-01-01 09:30:00 True
+2020-01-01 09:45:00 True
+2020-01-01 10:00:00 True
+2020-01-01 10:15:00 True
+2020-01-01 10:30:00 True
+2020-01-01 10:45:00 False
+2020-01-01 11:00:00 False
+2020-01-01 11:15:00 False
+2020-01-01 11:30:00 False
+2020-01-01 11:45:00 False
+2020-01-01 12:00:00 False
+2020-01-01 12:15:00 False
+2020-01-01 12:30:00 False
+2020-01-01 12:45:00 False
+2020-01-01 13:00:00 False
+2020-01-01 13:15:00 False
+2020-01-01 13:30:00 False
+2020-01-01 13:45:00 False
+2020-01-01 14:00:00 False
+2020-01-01 14:15:00 False
+2020-01-01 14:30:00 False
+2020-01-01 14:45:00 False
+2020-01-01 15:00:00 False
+2020-01-01 15:15:00 False
+2020-01-01 15:30:00 False
+2020-01-01 15:45:00 False
+2020-01-01 16:00:00 False
+2020-01-02 09:30:00 True
+2020-01-02 09:45:00 True
+2020-01-02 10:00:00 True
+2020-01-02 10:15:00 True
+2020-01-02 10:30:00 True
+2020-01-02 10:45:00 False
+2020-01-02 11:00:00 False
+2020-01-02 11:15:00 False
+2020-01-02 11:30:00 False
+2020-01-02 11:45:00 False
+2020-01-02 12:00:00 False
+2020-01-02 12:15:00 False
+2020-01-02 12:30:00 False
+2020-01-02 12:45:00 False
+2020-01-02 13:00:00 False
+2020-01-02 13:15:00 False
+2020-01-02 13:30:00 False
+2020-01-02 13:45:00 False
+2020-01-02 14:00:00 False
+2020-01-02 14:15:00 False
+2020-01-02 14:30:00 False
+2020-01-02 14:45:00 False
+2020-01-02 15:00:00 False
+2020-01-02 15:15:00 False
+2020-01-02 15:30:00 False
+2020-01-02 15:45:00 False
+2020-01-02 16:00:00 False
+dtype: bool
+
+Which is almost what I'm aiming to do, but how do I only target the second day? So that my array looks like:
+2020-01-01 09:30:00 False
+2020-01-01 09:45:00 False
+2020-01-01 10:00:00 False
+2020-01-01 10:15:00 False
+2020-01-01 10:30:00 False
+2020-01-01 10:45:00 False
+2020-01-01 11:00:00 False
+2020-01-01 11:15:00 False
+2020-01-01 11:30:00 False
+2020-01-01 11:45:00 False
+2020-01-01 12:00:00 False
+2020-01-01 12:15:00 False
+2020-01-01 12:30:00 False
+2020-01-01 12:45:00 False
+2020-01-01 13:00:00 False
+2020-01-01 13:15:00 False
+2020-01-01 13:30:00 False
+2020-01-01 13:45:00 False
+2020-01-01 14:00:00 False
+2020-01-01 14:15:00 False
+2020-01-01 14:30:00 False
+2020-01-01 14:45:00 False
+2020-01-01 15:00:00 False
+2020-01-01 15:15:00 False
+2020-01-01 15:30:00 False
+2020-01-01 15:45:00 False
+2020-01-01 16:00:00 False
+2020-01-02 09:30:00 True
+2020-01-02 09:45:00 True
+2020-01-02 10:00:00 True
+2020-01-02 10:15:00 True
+2020-01-02 10:30:00 True
+2020-01-02 10:45:00 False
+2020-01-02 11:00:00 False
+2020-01-02 11:15:00 False
+2020-01-02 11:30:00 False
+2020-01-02 11:45:00 False
+2020-01-02 12:00:00 False
+2020-01-02 12:15:00 False
+2020-01-02 12:30:00 False
+2020-01-02 12:45:00 False
+2020-01-02 13:00:00 False
+2020-01-02 13:15:00 False
+2020-01-02 13:30:00 False
+2020-01-02 13:45:00 False
+2020-01-02 14:00:00 False
+2020-01-02 14:15:00 False
+2020-01-02 14:30:00 False
+2020-01-02 14:45:00 False
+2020-01-02 15:00:00 False
+2020-01-02 15:15:00 False
+2020-01-02 15:30:00 False
+2020-01-02 15:45:00 False
+2020-01-02 16:00:00 False
+dtype: bool
+
+What I've tried:
+# get the days I'm interested in
+df_all_days = df[(df['date'] >= previous_day) & (df['date'] <= date)]
+
+# get the day I want to trade
+df_current_day = df[df['date'] == date]
+current_day_index = df_current_day.index
+# create an empty numpy array
+entries = pd.Series(True, index=current_day_index)
+
+# get the index of the current day between trading hours
+market_hours_indices = entries.index.indexer_between_time('9:30', '16:00')
+market_hours_index = entries.index[market_hours_indices]
+
+# return False any dates and times which fall outside of our date
+df_all_days['enter'] = df_all_days['enter'][~df_all_days['enter'].index.isin(market_hours_index)] = False
+
+This returns (everything false):
+2022-06-15 07:16:00 False
+2022-06-15 09:17:00 False
+2022-06-15 09:18:00 False
+2022-06-15 09:19:00 False
+...
+2022-06-16 19:59:00 False
+Name: enter, dtype: bool
+
","I figured it out:
+df = pd.read_csv(fname, index_col='datetime', usecols=["open", "high", "low", "close", "volume", "datetime", "date", "time"], parse_dates=True)
+
+# get the days I'm interested in
+df_all_days = df[(df['date'] >= previous_day) & (df['date'] <= date)]
+
+# get the day I want to trade
+df_current_day = df_all_days[df_all_days['date'] == date]
+# get the hours I want to trade
+df_current_day_indices = df_current_day.index.indexer_between_time('09:30', '16:00')
+current_day_index = df_current_day.index[df_current_day_indices]
+
+# create fake entry logic
+df_all_days['entries'] = np.where((0), True, False)
+# only enter during the days and hours we wanna trade
+df_all_days['entries'] = df_all_days['entries'].index.isin(current_day_index)
+
+with pd.option_context('display.max_rows', None, 'display.max_columns', None): # more options can be specified also
+ # both arrays are the same size
+ print(df_all_days['entries'].size)
+ print(df_all_days['close'].size)
+ print(df_all_days['entries'])
+
",python
+"Python, how to execute a line of code without it stopping the rest of the code from executing?First of all, I'm a beginner.
+What I want to accomplish is for music to play while the script is executing.
+Right now it plays the music, waits until the music is over, and then executes the rest of the code. That is not what I want. Here is my code:
+import os
+import subprocess
+import multiprocessing
+import threading
+from playsound import playsound
+CurrentPath = os.path.dirname(os.path.normpath(__file__))
+os.chdir(CurrentPath)
+
+def music():
+ Music = "Music.mp4"
+ #subprocess.run(["ffplay", "-nodisp", "-autoexit", "-hide_banner", Music])
+ playsound("Music.mp4")
+
+def other_things():
+ print("Hello World")
+
+#musicp = multiprocessing.Process(target=music())
+#restp = multiprocessing.Process(target=other_things())
+musicp = threading.Thread(target=music())
+restp = threading.Thread(target=other_things())
+
+restp.start()
+musicp.start()
+
+As you can see, I even tried multithreading, but it still waits until the music is over before it moves on to the rest of the code.
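+","Don't call the functions in the target parameter of the Thread constructor. With the parentheses, music() runs immediately in the main thread and its return value (None) is passed as the target. Remove the parentheses to pass the function itself: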
+musicp = threading.Thread(target=music) # instead of music()
+restp = threading.Thread(target=other_things) # instead of other_things()
+
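A runnable sketch of the corrected version is below. The time.sleep call is an assumption standing in for the blocking playsound call, and the events list is only there to record the order in which things finish.

```python
import threading
import time

events = []

def music():
    time.sleep(0.2)  # stands in for the blocking playsound call
    events.append('music finished')

def other_things():
    events.append('other things done')

# Pass the function objects themselves, without parentheses.
musicp = threading.Thread(target=music)
restp = threading.Thread(target=other_things)

musicp.start()
restp.start()
musicp.join()
restp.join()
print(events)
```

other_things completes while music is still running, which is the behaviour the question asks for.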
",python
+"Call compound.finance api with parametersI'm trying to simply call the compound.finance api "https://api.compound.finance/api/v2/account" with the parameter max_health. the doc says
+
+"If provided, should be given as { "value": "...string formatted
+number..." }".
+
+(https://compound.finance/docs/api#account-service)
+So I tried 4 methods here below:
+response = requests.get(
+ 'https://api.compound.finance/api/v2/account',
+ params={
+ "max_health": "1.0" # method 1
+ "max_health": {"value":"1.0"} # method 2
+ "max_health": json.dumps({"value":"1.0"}) # method 3
+ }
+)
+
+but it does not work, and I get
+
+HTTPError: 500 Server Error: Internal Server Error for url:...
+
+Any idea how I should format it, please?
","They did not update the API docs. You should send a POST request and provide params as a request body.
+import json
+import requests
+
+url = "https://api.compound.finance/api/v2/account"
+data = {
+ "max_health": {"value": "1.0"}
+}
+
+response = requests.post(url, data=json.dumps(data)) # <Response [200]>
+response = response.json() # {'accounts': ...}
+
+Edit notes
+The problem was that the API expects raw JSON so I used json.dumps.
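To see the difference between a raw JSON body and a form or query-string encoding, here is a small standard-library sketch. It makes no network request; it only compares the two serialisations of the same payload.

```python
import json
from urllib.parse import urlencode

payload = {'max_health': {'value': '1.0'}}

as_json = json.dumps(payload)   # nested structure survives intact
as_form = urlencode(payload)    # nested dict is flattened to its str()

print(as_json)
print(as_form)
```

A nested dict has no standard query-string representation, which is why the body has to be sent as JSON. With requests, passing json=data to requests.post serialises the payload and sets the Content-Type header in one step, so it can be used in place of data=json.dumps(data).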
",python
+"Format string to datetime in Pandas without zero padding, AM/PM and UTCMy csv file contains the date and time values like below. I want to convert this to a datetime format in pandas but I keep getting errors.
+I then want to convert it to a 24-hour format with zero padding and without the UTC suffix.
+I already tried: format="%m/%d/%Y %I:%M %p"
+10/12/2021 10:25 AM UTC
+9/28/2021 8:51 AM UTC
+7/27/2021 9:45 AM UTC
+2/2/2022 7:10 PM UTC
+
+Desired output:
+10/12/2021 10:25
+09/28/2021 08:51
+07/27/2021 09:45
+02/02/2022 19:10
+
","You can use %m/%d/%Y %H:%M:
+pd.to_datetime(df['date'], dayfirst=False).dt.strftime('%m/%d/%Y %H:%M')
+
+output:
+0 10/12/2021 10:25
+1 09/28/2021 08:51
+2 07/27/2021 09:45
+3 02/02/2022 19:10
+Name: date, dtype: object
+
+used input:
+df = pd.DataFrame({'date': ['10/12/2021 10:25 AM UTC',
+ '9/28/2021 8:51 AM UTC',
+ '7/27/2021 9:45 AM UTC',
+ '2/2/2022 7:10 PM UTC']})
+
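For a single value, the same conversion can be sketched with the standard library. The format from the question most likely failed because it does not account for the trailing UTC text; here it is added to the format as a literal. The function name reformat is made up for the example.

```python
from datetime import datetime

def reformat(stamp):
    # %I and %p parse the 12-hour clock; the literal ' UTC' consumes the suffix.
    parsed = datetime.strptime(stamp, '%m/%d/%Y %I:%M %p UTC')
    # %H writes the hour on a zero-padded 24-hour clock.
    return parsed.strftime('%m/%d/%Y %H:%M')

for stamp in ['9/28/2021 8:51 AM UTC', '2/2/2022 7:10 PM UTC']:
    print(reformat(stamp))
```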
",python
+"Pivot values in specific orderI have a dataframe where I would like to pivot my data to fit a specific format, making sure the dates are consecutive.
+Data
+ID Q122_c_en Q122con_s Q222_c_en Q222con_s Q322_c_en Q322con_s Q422_c_en Q422con_s
+AA 900 89 1000 90 1200 92 1000 90
+BB 1000 10 1000 20 1100 25 1300 30
+
+
+Desired
+ID Date con_en con_s
+AA Q122 900 89
+AA Q222 1000 90
+AA Q322 1200 92
+AA Q422 1000 90
+BB Q122 1000 10
+BB Q222 1000 20
+BB Q322 1100 25
+BB Q422 1300 30
+
+Doing
+df.pivot(index="ID", columns="Date", values=["con_en", "con_s"])
+
+I am using pivot; however, the result does not match the desired format above.
","one option, where you can do the transform in one step, is with pivot_longer from pyjanitor:
+# pip install pyjanitor
+import pandas as pd
+import janitor
+
+df.pivot_longer(
+ index = 'ID',
+ names_to = ('Date', '.value'),
+ names_pattern = r"(Q\d+)_?(.+)",
+ sort_by_appearance = True)
+
+ ID Date c_en con_s
+0 AA Q122 900 89
+1 AA Q222 1000 90
+2 AA Q322 1200 92
+3 AA Q422 1000 90
+4 BB Q122 1000 10
+5 BB Q222 1000 20
+6 BB Q322 1100 25
+7 BB Q422 1300 30
+
+Any label in the columns associated with .value stays as a header; this is determined by the groups in the regex in names_pattern.
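To see what the two regex groups capture, this sketch applies the same pattern to a few of the column names with the standard re module. It only illustrates the pattern itself and is not a description of how pyjanitor is implemented internally.

```python
import re

# Same pattern as names_pattern above: group 1 is the quarter, group 2 the rest.
NAMES_PATTERN = re.compile(r"(Q\d+)_?(.+)")

columns = ["Q122_c_en", "Q122con_s", "Q222_c_en", "Q222con_s"]

split_names = {col: NAMES_PATTERN.match(col).groups() for col in columns}

for col, (date, value_name) in split_names.items():
    # group 1 feeds the 'Date' column, group 2 stays as a header (.value)
    print(f"{col!r} -> Date={date!r}, .value={value_name!r}")
```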
",python
+"Pass task results from preceding tasks celeryHere is my code. I want to pass the result of the task myname to the task reverse as an argument in its signature.
+from app import app
+from time import sleep
+from celery.utils.log import get_task_logger
+import os
+from celery import signature, chain, group, chord
+from celery.result import allow_join_result
+
+
+MyQUEUE = os.getenv("SCANS_QUEUE")
+logger = get_task_logger(__name__)
+
+@app.task(queue=MyQUEUE, ignore_result=True)
+def reverse(text):
+ logger.info('reverse order '.format(text))
+ return {"reversename": str(text[::-1])}
+
+@app.task(queue=MyQUEUE, ignore_result=True)
+def add(a,b):
+ logger.info('Addition --> a : {0} & b : {1} '.format(a,b))
+ return {"addition": str(a+b)}
+
+@app.task(queue=MyQUEUE, ignore_result=True)
+def myname(a):
+ logger.info('Name --> a : {0}'.format(a))
+ return {"name": str(a)}
+
+
+@app.task(queue=MyQUEUE, ignore_result=True)
+def run_pipeline(a,b,n):
+ resultchain = chain([
+ group([
+ signature(
+ add,
+ args=(a,b),
+ queue=MyQUEUE
+ ),
+ signature(
+ myname,
+ args=(n),
+ queue=MyQUEUE
+ )
+ ]),
+ signature
+ (
+ reverse,
+ args=(-------),
+ queue=MyQUEUE
+ )
+ ]).apply_async()
+
+ with allow_join_result():
+ results = resultchain.join()
+ return results
+
","First and most important: if you are going to use chain, group, starmap or another kind of task workflow, the tasks whose results will be used later must be declared with ignore_result=False, or with the argument omitted (the default value is False). Otherwise their return values are not stored. This applies at least to myname and add.
+@app.task(queue=MyQUEUE)
+def add(a,b):
+ logger.info('Addition --> a : {0} & b : {1} '.format(a,b))
+ return {"addition": str(a+b)}
+
+@app.task(queue=MyQUEUE)
+def myname(a):
+ logger.info('Name --> a : {0}'.format(a))
+ return {"name": str(a)}
+
+
+Now, for reverse to obtain the results in the group of add and myname, you need to adjust reverse to handle the group result (a list of the results).
+For a chain the results of a task will be used as the first argument of the next task, in this case the group results will be injected in the first value of the reverse task as [{'addition': ...}, {'name': ...}], with that you can access the correct value.
+@app.task(queue=MyQUEUE)
+def reverse(group_data):
+ # group_data value: [{'addition': '3'}, {'name': 'VALUE'}]
+ text = group_data[1]['name']
+ logger.info('reverse order {0}'.format(text))
+ return {"reversename": str(text[::-1])}
+
+Finally if you only want to reverse the result of myname, you have to chain only myname and reverse.
+resultchain = chain([
+ signature(myname, args=(n,)),
+ signature(reverse)
+]).apply_async()
+
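The data flow described in this answer can be traced without a broker. The sketch below is plain Python with no Celery involved: the functions are ordinary callables standing in for the tasks, and the list stands in for the group result that, according to the answer, is injected as the first argument of reverse.

```python
def add(a, b):
    return {"addition": str(a + b)}


def myname(a):
    return {"name": str(a)}


def reverse(group_data):
    # group_data mirrors the group result: one entry per task in the group
    text = group_data[1]["name"]
    return {"reversename": text[::-1]}


# Stand-in for the group: results are collected in the order the tasks were listed.
group_results = [add(1, 2), myname("celery")]

# Stand-in for the chain step: the previous result becomes the first argument.
final = reverse(group_results)
print(final)
```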
",python
+"create volume in docker compose base on client directory pathI want to use a local folder for example dataset in my docker container. Also I have a python request module in which client send to API server a directory path. I want to use this path in docker compose as volume to use it in docker container.
+How can I send this path to docker compose?
+API request is :
+def train():
+ url = 'http://127.0.0.1:8000/train' # server url
+ data = {'label': ['fire'],
+ 'image_path': '/path/to/directory',
+ 'label_path': '/path/to/directory',
+ 'image_size': 640,
+ 'validation_split': 0.2,
+ 'save_dir': 'results/'}
+ r = requests.post(url, json=data)
+ print(r.text)
+
+How can I send the local paths 'image_path' and 'label_path' to docker compose and use them as volumes?
+my docker compose :
+version: "3.8"
+services:
+ inference:
+ container_name: "inference"
+ build: "./backbone"
+ ports:
+ - '5000:5000'
+ volumes:
+ - ./backbone:/code
+ # - ./volumes/weights:/weights
+ command: "python3 app.py"
+ environment:
+ - PORT=5000
+ - MODEL_PATH=/code/weights/best.pt
+ ipc: host
+ shm_size: 1024M
+
+ train:
+ container_name: "train"
+ build: "./train"
+ shm_size: '2gb'
+ command: "python3 main_train.py"
+ environment:
+ - RESPONSE_URL=http://127.0.0.1:8000/response
+ - LOGGER_URL=http://127.0.0.1:8000/logger
+ - PORT=8000
+ - IS_LOGGER_ON=False
+ ports:
+ - '8000:8000'
+ volumes:
+ - ./train:/code
+ ipc: host
+
","You need a couple of changes to make this work successfully.
+In the server code, you need to not accept a full path. There are a number of security concerns around doing this (can you retrieve the application code? the file of database credentials? system files like /etc/passwd?). Instead, set it to have a configurable data path, and only accept files within that path.
+import os
+
+DATA_PATH = os.environ.get('DATA_PATH', 'data')  # default relative to current directory
+
+def handle(data):
+    # accept only a bare file name, never a path
+    if '/' in data['image_file']:
+        raise InvalidInputError()  # application-defined exception
+    image_path = os.path.join(DATA_PATH, data['image_file'])
+    ...
+
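A slightly stricter variant of the '/' check resolves the requested name and verifies that the result is still inside the data directory. This is only a sketch under assumptions: the helper name resolve_in_data_dir and the use of ValueError are illustrative choices, not part of the original application.

```python
import os

DATA_PATH = os.environ.get("DATA_PATH", "data")


def resolve_in_data_dir(file_name, data_path=DATA_PATH):
    """Return the absolute path of file_name, refusing anything outside data_path."""
    base = os.path.realpath(data_path)
    candidate = os.path.realpath(os.path.join(base, file_name))
    # commonpath equals base only when candidate is base itself or lies below it
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"{file_name!r} is outside the data directory")
    return candidate
```

Because the path is resolved before the comparison, names such as ../secret.txt or an absolute path like /etc/passwd are rejected as well.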
+When you set this up in a Docker image, you can specify a fixed path for that data path. Using a simple path in a subdirectory of the container filesystem root is fine.
+# train/Dockerfile
+FROM python
+...
+# set this variable only because the application wants it
+ENV DATA_PATH=/data
+# create an empty directory by default
+RUN mkdir -p "$DATA_PATH"
+CMD ["./main_train.py"]
+
+Now when you launch this in Compose, you know the (fixed) container-side path that will hold the data directory, so you can mount content there.
+services:
+  train:
+    build: ./train
+    ports:
+      - '8000:8000'
+    volumes:
+      - ./dataset_3:/data
+
+It's usually the case that details like the filesystem layout in an image, paths for injected data or configuration, and the port that the service inside the container uses are fixed properties of the image. So it's safe to specify "the data will always be in /data" and use that as a mount point; you do not need to specify it as a variable anywhere in the Docker setup. Similarly, you should not need to set a $PORT environment variable in your Compose setup since the container-side port number will generally be a fixed property of your image.
",python
+"SMTP_HELO returns timeout when running email address validationUsing library py3-validate-email-1.0.5 (more here) to check if email address is valid, including SMTP check, I wasn't able to make it through check_smtp step, because I get following error:
+Python script
+from validate_email import validate_email
+from validate_email import validate_email_or_fail
+from csv import DictReader
+
+# iterate over each line by column name
+with open('email-list.csv', 'r') as read_obj:
+ csv_dict_reader = DictReader(read_obj, delimiter=';')
+ for row in csv_dict_reader:
+ i = 1
+ while i < 21:
+ header_name = 'Email'+str(i)
+ if validate_email_or_fail(
+ email_address=row[header_name],
+ check_format=True,
+ check_blacklist=True,
+ check_dns=True,
+ dns_timeout=10,
+ check_smtp=True,
+ smtp_timeout=5,
+ smtp_helo_host='emailsrv.domain.com',
+ smtp_from_address='email@domain.com',
+ smtp_skip_tls=False,
+ smtp_tls_context=None,
+ smtp_debug=False):
+ print('Email ' + row[header_name] + ' is valid.')
+ else:
+ print('Email ' + row[header_name] + ' is invalid.')
+ i += 1
+
+Error:
+Traceback (most recent call last):
+ File "//./main.py", line 13, in <module>
+ if validate_email_or_fail(
+ File "/usr/local/lib/python3.9/site-packages/validate_email/validate_email.py", line 59, in validate_email_or_fail
+ return smtp_check(
+ File "/usr/local/lib/python3.9/site-packages/validate_email/smtp_check.py", line 229, in smtp_check
+ return smtp_checker.check(hosts=mx_records)
+ File "/usr/local/lib/python3.9/site-packages/validate_email/smtp_check.py", line 197, in check
+ raise SMTPTemporaryError(error_messages=self.__temporary_errors)
+validate_email.exceptions.SMTPTemporaryError: Temporary error in email address verification:
+mx.server.com: 451 timed out (in reply to 'connect')
+
+I figured there is problem with my DNS settings (probably), so I dockerized the script and run it on AWS EC2, where I have used elastic IP, attached it to the EC2 instance where docker container is running, I also setup reverse DNS for domain emailsrv.domain.com with this elastic IP. Tried to run the script, no change.
+Then I added MX record pointing to the emailsrv.domain.com, but still no change. The DNS records are setup properly, because I have checked it with multiple DNS tools available.
+Since the library doesn't require to actually use my email address login details, I wonder what can be the problem? Just to be sure, the email address used in the script doesn't exist, since I don't have smtp server setup on that instance, obviously.
+Any ideas?
","The reason was a blocked port on the AWS EC2 instance. Opening the port in the security group is not enough; you must send a request to AWS so that they remove the restriction on port 25.
+Once they did that, it worked flawlessly.
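A quick way to check whether outbound connections to a mail server's port 25 are possible at all is a plain TCP connect with the standard socket module. This is a minimal sketch; the helper name can_connect is an illustrative choice and is not part of py3-validate-email.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and DNS failures
        return False
```

Calling can_connect("mx.server.com", 25) from the EC2 instance (with the placeholder host from the traceback replaced by a real MX host) returning False, while other ports are reachable, would point to a blocked port rather than a DNS problem.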
",python
+"How to get Data from MySQL DB on Python Flask?I'm learning Python Flask with MySQL Workbench. I have a database 'books' and a table 'books.books_tb' in my Workbench. I made a simple Flask app following the tutorial, which looks like this:
+from flask import Flask
+from flask_mysqldb import MySQL
+
+@app.route('/books')
+def home():
+ def GetBookLink():
+ mydb = mysql.connector.connect(
+ host='localhost',
+ user = 'root',
+ database = 'books'
+ )
+ mycursor = mydb.mycursor()
+
+
+
+ mycursor.execute("SELECT * FROM books.books_tb")
+ DBData = mycursor.fetchall()
+ mycursor.close()
+ return DBData
+
+ DBData = GetBookLink()
+ return render_template("index.html", ScrapedBookData = DBData)
+
+
+my index.html looks like that:
+{% extends "base.html"%} {% block title %}Home{% endblock %}{% block content %}
+<h1>{{ScrapedBookData}}</h1>
+{% endblock %}
+
+But I get an error (link at bottom) and as a newbie I don't understand how to solve it. How can I solve this problem and write some queries to display the rows in my database?
+[enter image description here][1]
+[1]: https://i.stack.imgur.com/lZTGQ.png
","From the first page of the documentation, you're missing the instantiation of the MySQL class into a mysql variable.
+from flask import Flask
+from flask_mysqldb import MySQL
+
+app = Flask(__name__)
+mysql = MySQL(app) # this is the instantiation
+
+
+@app.route('/')
+def users():
+ cur = mysql.connection.cursor()
+ cur.execute('''SELECT user, host FROM mysql.user''')
+ rv = cur.fetchall()
+ return str(rv)
+
+if __name__ == '__main__':
+ app.run(debug=True)
+
",python
+"Trying to open .bson file and read to pandas df but getting 'bson.errors.InvalidBSON: objsize too large' first time using .bson#This is my code
+import pandas as pd
+import bson
+
+FILE="users_(1).bson"
+
+with open(FILE,'rb') as f:
+ data = bson.decode_all(f.read())
+
+main_df=pd.DataFrame(data)
+main_df.describe()
+
+#This is my .bson file
+[{'_id': ObjectId('999f24f260f653401b'),
+ 'isV2': False,
+ 'isBeingMigratedToV2': False,
+ 'firstName': 'Jezz',
+ 'lastName': 'Bezos',
+ 'subscription': {'_id': ObjectId('999f24f260f653401b'),
+ 'chargebeeId': 'AzZdd6T847kHQ',
+ 'currencyCode': 'EUR',
+ 'customerId': 'AzZdd6T847kHQ',
+ 'nextBillingAt': datetime.datetime(2022, 7, 7, 10, 14, 6),
+ 'numberOfMonthsPaid': 1,
+ 'planId': 'booster-v3-eur',
+ 'startedAt': datetime.datetime(2022, 6, 7, 10, 14, 6),
+ 'addons': [],
+ 'campaign': None,
+ 'maskedCardNumber': '************1234'},
+ 'email': 'jeffbezos@gmail.com',
+ 'groupName': None,
+ 'username': 'jeffbezy',
+ 'country': 'DE'},
+ {'_id': ObjectId('999f242660f653401b'),
+ 'isV2': False,
+ 'isBeingMigratedToV2': False,
+ 'firstName': 'Caterina',
+ 'lastName': 'Fake',
+ 'subscription': {'_id': ObjectId('999f242660f653401b'),
+ 'chargebeeId': '16CGLYT846t99',
+ 'currencyCode': 'GBP',
+ 'customerId': '16CGLYT846t99',
+ 'nextBillingAt': datetime.datetime(2022, 7, 7, 10, 10, 41),
+ 'numberOfMonthsPaid': 1,
+ 'planId': 'personal-v3-gbp',
+ 'startedAt': datetime.datetime(2022, 6, 7, 10, 10, 41),
+ 'addons': [],
+ 'campaign': None,
+ 'maskedCardNumber': '************4311'},
+ 'email': 'caty.fake@gmail.com',
+ 'groupName': None,
+ 'username': 'cfake',
+ 'country': 'GB'}]
+
+I get the error
+'bson.errors.InvalidBSON: objsize too large'
+
+Is it something to do with the datetime? Is it the structure of the .bson file? I have been at this for hours and can't seem to see the error. I know how to work with json and tried to convert it to json, but with no success. Any tips would be appreciated.
","If the main goal here is to read the data into a pandas DataFrame you could indeed format the data to json and use bson.json_util.loads:
+import pandas as pd
+from bson.json_util import loads
+
+with open(filepath,'r') as f:
+ data = f.read()
+
+mapper = {
+ '\'': '"', # using double quotes
+ 'False': 'false',
+ 'None': '\"None\"', # double quotes around None
+ # modifying the ObjectIds and timestamps
+ '("': '(',
+ '")': ')',
+ ')': ')"',
+ 'ObjectId': '"ObjectId',
+ 'datetime.datetime': '"datetime.datetime'
+}
+for k, v in mapper.items():
+ data = data.replace(k, v)
+
+data = loads(data)
+df = pd.DataFrame(data)
+
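The text replacement above is needed because the listing in the question looks like a printed Python representation rather than binary BSON. A binary BSON document starts with a little-endian int32 holding its total size and ends with a zero byte, so a rough check like the one below can tell the two apart. It is a heuristic sketch only, and the function name looks_like_bson is an illustrative choice.

```python
import struct


def looks_like_bson(raw: bytes) -> bool:
    """Heuristic: does raw start with a plausible BSON document?"""
    if len(raw) < 5:
        return False
    # The first four bytes are the declared document size (little-endian int32).
    (declared_size,) = struct.unpack("<i", raw[:4])
    # The size must fit inside the data, and the document must end with 0x00.
    return 5 <= declared_size <= len(raw) and raw[declared_size - 1] == 0
```

When the first bytes are text such as [{'_id', the declared size decodes to a very large number, which is consistent with the "objsize too large" error.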
",python
+"Flatten a nested xml with pandasI have the following xml in this format:
+<?xml version='1.0' encoding='UTF-8'?>
+<ettevotjad>
+ <ettevotja>
+ <nimi>000 Holdings OÜ</nimi>
+ <ariregistri_kood>16372442</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>23.11.2021</ettevotja_esmakande_kpv>
+ <ettevotja_aadress>
+ <asukoht_ettevotja_aadressis/>
+ <asukoha_ehak_kood/>
+ <asukoha_ehak_tekstina></asukoha_ehak_tekstina>
+ <indeks_ettevotja_aadressis/>
+ <ads_adr_id></ads_adr_id>
+ <ads_ads_oid></ads_ads_oid>
+ <ads_normaliseeritud_taisaadress/>
+ </ettevotja_aadress>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/16372442</teabesysteemi_link>
+ </ettevotja>
+ <ettevotja>
+ <nimi>001 group OÜ</nimi>
+ <ariregistri_kood>12754230</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>17.11.2014</ettevotja_esmakande_kpv>
+ <ettevotja_aadress>
+ <asukoht_ettevotja_aadressis>Õismäe tee 78-9</asukoht_ettevotja_aadressis>
+ <asukoha_ehak_kood>0176</asukoha_ehak_kood>
+ <asukoha_ehak_tekstina>Haabersti linnaosa, Tallinn, Harju maakond</asukoha_ehak_tekstina>
+ <indeks_ettevotja_aadressis>13513</indeks_ettevotja_aadressis>
+ <ads_adr_id>2182337</ads_adr_id>
+ <ads_ads_oid></ads_ads_oid>
+ <ads_normaliseeritud_taisaadress>Harju maakond, Tallinn, Haabersti linnaosa, Õismäe tee 78-9</ads_normaliseeritud_taisaadress>
+ </ettevotja_aadress>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/12754230</teabesysteemi_link>
+ </ettevotja>
+</ettevotjad>
+
+Using pandas' .read_xml() yields:
+ import pandas as pd
+
+ data = pd.read_xml('test_file.xml')
+
+ print(data.head(2).to_string())
+ nimi ariregistri_kood ettevotja_oiguslik_vorm ettevotja_oigusliku_vormi_alaliik kmkr_nr ettevotja_staatus ettevotja_staatus_tekstina ettevotja_esmakande_kpv ettevotja_aadress teabesysteemi_link
+0 000 Holdings OÜ 16372442 Osaühing NaN None R Registrisse kantud 23.11.2021 NaN https://ariregister.rik.ee/est/company/16372442
+1 001 group OÜ 12754230 Osaühing NaN None R Registrisse kantud 17.11.2014 NaN https://ariregister.rik.ee/est/company/12754230
+
+Notice in the dataframe 'ettevotja_aadress' is NaN, but in fact if you look at the xml structure, it's nested with those sub columns/headers. How do I expand out those nested columns into the dataframe?
+I thought one way to do it was to simply read in the file, remove the <ettevotja_aadress> and </ettevotja_aadress> tags, then read into pandas, but it seems like there should be a direct way to do this, similar to pandas' .json_normalize().
","This could be done by providing an XSL stylesheet to flatten the original XML. The code looks like the following:
+xml = '''<?xml version='1.0' encoding='UTF-8'?>
+<ettevotjad>
+ <ettevotja>
+ <nimi>000 Holdings OÜ</nimi>
+ <ariregistri_kood>16372442</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>23.11.2021</ettevotja_esmakande_kpv>
+ <ettevotja_aadress>
+ <asukoht_ettevotja_aadressis/>
+ <asukoha_ehak_kood/>
+ <asukoha_ehak_tekstina></asukoha_ehak_tekstina>
+ <indeks_ettevotja_aadressis/>
+ <ads_adr_id></ads_adr_id>
+ <ads_ads_oid></ads_ads_oid>
+ <ads_normaliseeritud_taisaadress/>
+ </ettevotja_aadress>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/16372442</teabesysteemi_link>
+ </ettevotja>
+ <ettevotja>
+ <nimi>001 group OÜ</nimi>
+ <ariregistri_kood>12754230</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>17.11.2014</ettevotja_esmakande_kpv>
+ <ettevotja_aadress>
+ <asukoht_ettevotja_aadressis>Õismäe tee 78-9</asukoht_ettevotja_aadressis>
+ <asukoha_ehak_kood>0176</asukoha_ehak_kood>
+ <asukoha_ehak_tekstina>Haabersti linnaosa, Tallinn, Harju maakond</asukoha_ehak_tekstina>
+ <indeks_ettevotja_aadressis>13513</indeks_ettevotja_aadressis>
+ <ads_adr_id>2182337</ads_adr_id>
+ <ads_ads_oid></ads_ads_oid>
+ <ads_normaliseeritud_taisaadress>Harju maakond, Tallinn, Haabersti linnaosa, Õismäe tee 78-9</ads_normaliseeritud_taisaadress>
+ </ettevotja_aadress>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/12754230</teabesysteemi_link>
+ </ettevotja>
+</ettevotjad>
+'''
+
+import pandas as pd
+
+stylesheet = '''<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
+<xsl:output indent="yes"/>
+<xsl:template match="/">
+ <ettevotjad>
+ <xsl:apply-templates select="//ettevotja"/>
+ </ettevotjad>
+</xsl:template>
+<xsl:template match="ettevotja">
+ <ettevotja>
+ <xsl:copy-of select="node()[not(self::ettevotja_aadress)]"/>
+ <xsl:apply-templates select="./ettevotja_aadress"/>
+ </ettevotja>
+ </xsl:template>
+ <xsl:template match="ettevotja_aadress">
+ <xsl:copy-of select="node()"/>
+ </xsl:template>
+ </xsl:stylesheet>
+'''
+
+df = pd.read_xml(xml, xpath="//ettevotja", stylesheet = stylesheet)
+
+df.head()
+
+The idea behind the code is that XSL transformation is applied to the XML document, flattening its structure. The result of the transformation will be:
+<?xml version="1.0" encoding="UTF-8"?>
+<ettevotjad>
+ <ettevotja>
+ <nimi>000 Holdings OÜ</nimi>
+ <ariregistri_kood>16372442</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>23.11.2021</ettevotja_esmakande_kpv>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/16372442</teabesysteemi_link>
+
+ <asukoht_ettevotja_aadressis/>
+ <asukoha_ehak_kood/>
+ <asukoha_ehak_tekstina/>
+ <indeks_ettevotja_aadressis/>
+ <ads_adr_id/>
+ <ads_ads_oid/>
+ <ads_normaliseeritud_taisaadress/>
+ </ettevotja>
+ <ettevotja>
+ <nimi>001 group OÜ</nimi>
+ <ariregistri_kood>12754230</ariregistri_kood>
+ <ettevotja_oiguslik_vorm>Osaühing</ettevotja_oiguslik_vorm>
+ <ettevotja_oigusliku_vormi_alaliik/>
+ <kmkr_nr/>
+ <ettevotja_staatus>R</ettevotja_staatus>
+ <ettevotja_staatus_tekstina>Registrisse kantud</ettevotja_staatus_tekstina>
+ <ettevotja_esmakande_kpv>17.11.2014</ettevotja_esmakande_kpv>
+ <teabesysteemi_link>https://ariregister.rik.ee/est/company/12754230</teabesysteemi_link>
+
+ <asukoht_ettevotja_aadressis>Õismäe tee 78-9</asukoht_ettevotja_aadressis>
+ <asukoha_ehak_kood>0176</asukoha_ehak_kood>
+ <asukoha_ehak_tekstina>Haabersti linnaosa, Tallinn, Harju maakond</asukoha_ehak_tekstina>
+ <indeks_ettevotja_aadressis>13513</indeks_ettevotja_aadressis>
+ <ads_adr_id>2182337</ads_adr_id>
+ <ads_ads_oid/>
+ <ads_normaliseeritud_taisaadress>Harju maakond, Tallinn, Haabersti linnaosa, Õismäe tee 78-9</ads_normaliseeritud_taisaadress>
+ </ettevotja>
+</ettevotjad>
+
+After the transformation an XPath //ettevotja is applied to the transformation result, taking children of ettevotja elements as dataframe row fields.
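For comparison, the same one-level flattening can be sketched with the standard library's ElementTree instead of XSLT. This is an alternative technique, not what read_xml does internally; the helper name flatten_records and the shortened sample document are illustrative.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<ettevotjad>
  <ettevotja>
    <nimi>001 group OÜ</nimi>
    <ariregistri_kood>12754230</ariregistri_kood>
    <ettevotja_aadress>
      <asukoha_ehak_kood>0176</asukoha_ehak_kood>
      <indeks_ettevotja_aadressis>13513</indeks_ettevotja_aadressis>
    </ettevotja_aadress>
  </ettevotja>
</ettevotjad>
"""


def flatten_records(xml_text, record_tag):
    """Return one dict per record, lifting one level of nested children."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        row = {}
        for child in record:
            if len(child):  # element has sub-elements: lift them into the row
                for grandchild in child:
                    row[grandchild.tag] = grandchild.text
            else:
                row[child.tag] = child.text
        rows.append(row)
    return rows


rows = flatten_records(SAMPLE_XML, "ettevotja")
print(rows)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) to obtain one column per flattened tag.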
",python
+"FunctionTransformer & creating new columns in pipelineI have a sample data:
+df = pd.DataFrame(columns=['X1', 'X2', 'X3'], data=[
+ [1,16,9],
+ [4,36,16],
+ [1,16,9],
+ [2,9,8],
+ [3,36,15],
+ [2,49,16],
+ [4,25,14],
+ [5,36,17]])
+
+I want to create two complementary columns in my df based on X2 and X3 and include them in the pipeline.
+I am trying to follow the code:
+def feat_comp(x):
+ x1 = 100-x
+ return x1
+
+pipe_text = Pipeline([('col_test', FunctionTransformer(feat_comp, 'X2',validate=False))])
+X = pipe_text.fit_transform(df)
+
+It gives me an error:
+TypeError: 'str' object is not callable
+
+How can I apply the function transformer on selected columns and how can I use them in the pipeline?
","If I understand you correctly, you want to add a new column based on a given column, e.g. X2. You need to pass this column as an additional argument to the function using kw_args:
+import pandas as pd
+from sklearn.preprocessing import FunctionTransformer
+from sklearn.pipeline import Pipeline
+
+df = pd.DataFrame(columns=['X1', 'X2', 'X3'], data=[
+ [1,16,9],
+ [4,36,16],
+ [1,16,9],
+ [2,9,8],
+ [3,36,15],
+ [2,49,16],
+ [4,25,14],
+ [5,36,17]])
+
+def feat_comp(x, column):
+ x[f'100-{column}'] = 100 - x[column]
+ return x
+
+pipe_text = Pipeline([('col_test', FunctionTransformer(feat_comp, validate=False, kw_args={'column': 'X2'}))])
+pipe_text.fit_transform(df)
+
+Result:
+ X1 X2 X3 100-X2
+0 1 16 9 84
+1 4 36 16 64
+2 1 16 9 84
+3 2 9 8 91
+4 3 36 15 64
+5 2 49 16 51
+6 4 25 14 75
+7 5 36 17 64
+
+(in your example FunctionTransformer(feat_comp, 'X2', validate=False), 'X2' would be the inverse_func, and the string 'X2' is not callable, hence the error)
",python
+"PyTorch CNN Different Input SizeHello Guys I have a question about different Input Sizes.
+My training set and validation dataset have an input Size of 256 and for my prediction (with an unseen Test Dataset) I have an input size of 496.
+class Net(nn.Module):
+ def __init__(self, shape):
+ super(Net,self).__init__()
+ self.conv1 = nn.Conv1d(shape,1,1)
+ self.batch1 = nn.BatchNorm1d(1)
+ self.avgpl1 = nn.AvgPool1d(1, stride=1)
+ self.fc1 = nn.Linear(1,3)
+
+ #forward method
+ def forward(self,x):
+ x = self.conv1(x)
+ x = self.batch1(x)
+ x = F.relu(x)
+ x = self.avgpl1(x)
+ x = torch.flatten(x,1)
+ x = F.log_softmax(self.fc1(x))
+ return x
+
+I saved the model and wanna use it also for my prediction.
+Error Message is:
+Input In [244], in predict_data(prediction_data, model_path, data_config, context)
+ 25 new_model = Net(shape_preprocessed_data)
+ 26 # load the previously saved state_dict
+---> 27 new_model.load_state_dict(torch.load("NetModel.pth"))
+ 29 # check if predictions of models are equal
+ 30
+ 31 # generate random input of size (N,C,H,W)
+ 32
+ 33 # switch to eval mode for both models
+ 34 model = model.eval()
+
+ RuntimeError: Error(s) in loading state_dict for Net:
+ size mismatch for conv1.weight: copying a param with shape
+ torch.Size([1, 256, 1]) from checkpoint, the shape in current model is torch.Size([1, 494, 1]).
+
+How can I solve this?
+
","You could reshape/downsample the input as the first step of the forward pass in your model. This can be done using the torch.nn.functional.interpolate function.
+For example:
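+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+class Net(nn.Module):
+    def __init__(self, shape):
+        super(Net, self).__init__()
+        self.input_shape = shape
+        self.conv1 = nn.Conv1d(shape, 1, 1)
+        self.batch1 = nn.BatchNorm1d(1)
+        self.avgpl1 = nn.AvgPool1d(1, stride=1)
+        self.fc1 = nn.Linear(1, 3)
+
+    # forward method
+    def forward(self, x):
+        # interpolate resizes only the last axis, while conv1 expects `shape` channels,
+        # so (assuming x is (batch, features, 1)) move features last, resize, move back
+        x = F.interpolate(x.transpose(1, 2), size=self.input_shape).transpose(1, 2)
+        x = self.conv1(x)
+        x = self.batch1(x)
+        x = F.relu(x)
+        x = self.avgpl1(x)
+        x = torch.flatten(x, 1)
+        x = F.log_softmax(self.fc1(x), dim=1)
+        return x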
+
+Your test images would then be downsampled to size 256 in order to be compatible with the model.
",python
+"Multiple filter data dropdown table with plotly in pythonI am currently trying to generate a table with multiple dropdown options (in a Jupyter Notebook). I have been able to create the scenario, but the dropdown buttons work independently: if I select option 'A' from my 'Decision' dropdown and then select option 'Female' from the other dropdown, 'Sex', I get either Decision 'A' or Sex 'Female', depending on the button last selected. However, what I really want is to have both filters applied at the same time, so that I would get all rows with Decision 'A' that are 'Female' in my table.
+This is my code:
+fig = go.Figure(go.Table(header={"values": df_dash.columns,'fill_color':'navy','align':'left',
+ 'font':dict(color='white', size=12)},
+ cells={"values": df_dash.T.values,'fill_color':'white','align':'left'}))
+fig.update_layout(
+ updatemenus=[
+ {
+ "y": 1 - (i / 5),
+ "buttons": [
+ {
+ "label": c,
+ "method": "restyle",
+ "args": [
+ {
+ "cells": {
+ "values": df_dash.T.values
+ if c == "All"
+ else df_dash.loc[df_dash[menu].eq(c)].T.values
+ }
+ }
+ ],
+ }
+ for c in ["All"] + df_dash[menu].unique().tolist()
+
+ ],
+
+ }
+ for i, menu in enumerate(["Decision", "Sex",])
+ ]
+)
+
+And this is how it looks:
+![]()
+Any ideas what I am doing wrong, or what I can add to make both dropdowns filter my df table at the same time?
","Based on this answer in the plotly forums, you cannot create buttons that are dependent on each other.
+The best you could probably do is create one dropdown with every possible combination in your current Decision and Sex dropdowns – which I understand probably isn't ideal.
+If you are willing to use plotly-dash, then it is possible as you can create two dropdowns that are inputs to a basic callback that updates the fig, as shown in this example in the plotly-dash documentation
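The single-dropdown workaround can be sketched by generating one button per combination with itertools.product. The example below is a plain-Python sketch that uses a small list of dicts as a stand-in for df_dash and only builds the button dictionaries in the same shape as in the question; the sample rows and the helper names options and cell_values are assumptions made for illustration.

```python
from itertools import product

# Stand-in for df_dash: one dict per table row.
rows = [
    {"Name": "r1", "Decision": "A", "Sex": "Female"},
    {"Name": "r2", "Decision": "A", "Sex": "Male"},
    {"Name": "r3", "Decision": "B", "Sex": "Female"},
]
columns = list(rows[0])


def options(column):
    """'All' plus every distinct value of the column."""
    return ["All"] + sorted({row[column] for row in rows})


def cell_values(selected):
    """Column-major cell values for the rows matching every selected filter."""
    kept = [
        row for row in rows
        if all(value == "All" or row[col] == value for col, value in selected.items())
    ]
    return [[row[col] for row in kept] for col in columns]


# One button per (Decision, Sex) combination, same structure as in the question.
buttons = [
    {
        "label": f"Decision={d}, Sex={s}",
        "method": "restyle",
        "args": [{"cells": {"values": cell_values({"Decision": d, "Sex": s})}}],
    }
    for d, s in product(options("Decision"), options("Sex"))
]
print(len(buttons))
```

The number of buttons grows with the product of the option counts, which is why this approach becomes unwieldy for more than a few filters.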
",python
+"Not able to access child window using pywinautoControl identifiers
+![]()
+I am trying to automate the process of decrypting a file with the kleopatra application using the pywinauto library. I'm able to connect and access the elements of the main window, but I'm not able to get hold of the new window that pops up.
+I want to access the new windows element and open a file from that window whose title is "Select One or More Files to Decrypt and/or Verify".
+Here is what I'm doing in the code, but getting the error.
+import pywinauto
+app = pywinauto.application.Application(backend='uia').start(r"C:\Program Files (x86)\Gpg4win\bin\kleopatra.exe").connect(title="kleopatra",timeout=100)
+
+fileMenu = app.kleopatra.child_window(title="File", auto_id="MainWindow#1.file.file", control_type="MenuItem").wrapper_object()
+
+fileMenu.click_input()
+
+# app.kleopatra.print_control_identifiers()
+
+decryptMenu = app.kleopatra.child_window(title="Decrypt/Verify...", auto_id="KXMLGUIClient-KActionCollection.file_decrypt_verify_files", control_type="MenuItem")
+
+decryptMenu.click_input()
+
+selectFileWindow = pywinauto.application.Application(backend='uia').connect(title = "Select One or More Files to Decrypt and/or Verify")
+
+
+
+How can I connect to new window?
+Error
+![]()
","Try this -
+selectFileWindow = app.window(title = "Select One or More Files to Decrypt and/or Verify")
",python
+"How to insert variable length list into stringI have what I think is a basic question in Python:
+I have a list that can be variable in length and I need to insert it into a string for later use.
+Formatting is simple, I just need a comma between each name up to nameN and parenthesis surrounding the names.
+List = ['name1', 'name2' .... 'nameN']
+string = "Their Names are <(name1 ... nameN)> and they like candy."
+
+Example:
+List = ['tom', 'jerry', 'katie']
+print(string)
+Their Names are (tom, jerry, katie) and they like candy.
+
+Any ideas on this? Thanks for the help!
","# Create a comma-separated string with names
+the_names = ', '.join(List) # 'tom, jerry, katie'
+
+# Interpolate it into the "main" string
+string = f"Their Names are ({the_names}) and they like candy."
+
",python
+"How to return a big number in a plain format?I want to return a big number in a plain format but instead I'm getting this:
+>>> x=1000000000000000000000000000000
+>>> int(1*x/100) # try to get 1% of x
+9999999999999999583119736832 # <------------ here
+
+I was expecting 10000000000000000000000000000, how do I achieve this?
","You can use decimal to handle the decimal point more precisely.
+It provides fast, correctly rounded decimal floating point arithmetic. In binary floating point, numbers like 1.1 and 2.2 do not have exact representations; in decimal floating point, decimal numbers can be represented exactly.
+from decimal import Decimal
+x = Decimal(1000000000000000000000000000000)
+print(int(1*x/100)) # 10000000000000000000000000000
+
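Since x is an integer here, another option is to stay in integer arithmetic with floor division, which avoids the float conversion that the / operator performs. This sketch compares the three approaches; the variable names are illustrative.

```python
from decimal import Decimal

x = 1000000000000000000000000000000  # 10**30

# True division converts to a binary float first, which cannot hold 10**28 exactly.
via_float = int(x / 100)

# Floor division keeps everything as arbitrary-precision integers.
via_int = x // 100

# Decimal division, as in the answer above.
via_decimal = int(Decimal(x) / 100)

print(via_float)
print(via_int)
print(via_decimal)
```

Note that Decimal rounds to the context precision (28 significant digits by default), so for integers with more significant digits than that, floor division is the safer choice.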
",python
+"List Comprehension for List of Dictionary to get Values Separately for Each KeyI want to get the city names and their respective population in separate list from a given list of dictionary. I have achieved this using naive method and using map() function as well but I need it to be executed using List Comprehension technique. I have tried below code but it is not giving proper output. What modifications should I do, please comment. Thanks.
+towns = [{'name': 'Manchester', 'population': 58241},
+ {'name': 'Coventry', 'population': 12435},
+ {'name': 'South Windsor', 'population': 25709}]
+
+print('Name of towns in the city are:', [item for item in towns[item]['name'].values()])
+print('Population of each town in the city are:', [item for item in towns[item]['population'].values()])
+
+** Expected Output **
+Name of towns in the city are: ['Manchester', 'Coventry', 'South Windsor']
+Population of each town in the city are: [58241, 12435, 25709]
","try this :
+towns = [{'name': 'Manchester', 'population': 58241},
+ {'name': 'Coventry', 'population': 12435},
+ {'name': 'South Windsor', 'population': 25709}]
+
+print('Name of towns in the city are:',
+ [town['name'] for town in towns])
+print('Population of each town in the city are:',
+ [town['population'] for town in towns])
+
+output:
+Name of towns in the city are: ['Manchester', 'Coventry', 'South Windsor']
+Population of each town in the city are: [58241, 12435, 25709]
+
",python
+"Find counts of duplicates in list of setsHi, I have to check for overlaps and for duplicate strings in my data. I have this data: s = [(100, 350,"a"), (125, 145,"a"), (200, 400, "d"), (0, 10, "a")]. I have done the overlap part, but I need help with the duplicate check of the strings. Can anyone help me find the duplicate strings?
+def overlap(a, b) -> bool:
+ a_start, a_end, _ = a
+ b_start, b_end, _ = b
+ return a_start < b_end and b_start < a_end
+ls = [(100, 350,"a"), (125, 145,"a"), (200, 400, "d"), (0, 10, "a")]
+overlaps = set()
+for idx_a in range(len(ls)):
+ for idx_b in range(len(ls)):
+ if idx_a != idx_b:
+ if overlap(ls[idx_a], ls[idx_b]):
+ overlaps.add(ls[idx_a])
+ overlaps.add(ls[idx_b])
+
+print(f"Number of overlaps: {len(overlaps)}")
+
","It seems you don't need a set if you only need the number of overlaps.
+I would solve your problem like this:
+def is_overlapped(a, b) -> bool:  # renamed for readability
+    a_start, a_end, _ = a
+    b_start, b_end, _ = b
+    return a_start < b_end and b_start < a_end
+ls = [(100, 350,"a"), (125, 145,"a"), (200, 400, "d"), (0, 10, "a")]
+overlaps = 0  # int instead of set
+for idx_a, value_a in enumerate(ls):  # enumerate gives the index and the item at the same time
+    for idx_b, value_b in enumerate(ls):
+        if idx_a < idx_b:  # visit each pair once so an overlap is not counted twice
+            if is_overlapped(value_a, value_b):
+                overlaps += 1
+print(f"Number of overlaps: {overlaps}")
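For the duplicate-string part of the question, one possible sketch is to count the third element of each tuple with collections.Counter. The names label_counts and duplicates are made up for this illustration:

```python
from collections import Counter

ls = [(100, 350, 'a'), (125, 145, 'a'), (200, 400, 'd'), (0, 10, 'a')]

# Count how often each label (the third tuple element) appears.
label_counts = Counter(label for _, _, label in ls)

# Keep only the labels that occur more than once.
duplicates = {label: n for label, n in label_counts.items() if n > 1}
print(duplicates)  # {'a': 3}
```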
+
",python
+"How to reshape a (x, y) numpy array into a (x, y, 1) array?How do you reshape a (55, 11) numpy array to a (55, 11, 1) numpy array?
+Attempts:
+
+- Simply doing numpy_array.reshape(-1, 1) without any loop produces a flat array that is not 3D.
+- The following for loop produces a "cannot broadcast" error:
+
+for i in range(len(numpy_array)):
+    numpy_array[i] = numpy_array[i].reshape(-1, 1)
+
","Maybe you are looking for numpy.expand_dims(https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html)?
+import numpy
+a = numpy.random.rand(55,11)
+print(a.shape) # 55,11
+print(numpy.expand_dims(a, 2).shape) # 55, 11, 1
+
",python
+"How to get my messages to show on Django for a registration form?I am creating a registration form using the Django framework and I want to display some error messages to the user if they enter a confirmation password that does not match, an email that is already taken, etc. I have written the code and it seems to be working, but I can't get the messages to show on screen after redirecting back to the registration page when there is an error in the form. I have imported messages in views.py (from django.contrib import messages) and I think my settings.py is configured correctly.
+Here is my views.py code:
+def register(request):
+if request.method == "GET":
+ register_form = RegisterForm()
+ return render(request, "main/register.html", {
+ 'form': register_form
+ })
+else:
+ register_form = RegisterForm(request.POST)
+ if register_form.is_valid():
+ first_name = request.POST['first_name']
+ last_name = request.POST['last_name']
+ username = request.POST['username']
+ email = request.POST['email']
+ password = request.POST['password']
+ confirm_password = request.POST['confirm_password']
+
+
+ if password == confirm_password:
+ if User.objects.filter(email=email).exists():
+ messages.info(request, 'Email or user name Already taking')
+ return redirect('register')
+ elif User.objects.filter(username=username).exists():
+ messages.info(request, 'username is taken')
+ return redirect('register')
+ else:
+ User.objects.get_or_create(username=username,
+ first_name=first_name, last_name=last_name, email=email,
+ password=password)
+
+ return redirect('main/login.html')
+ else:
+ messages.error(request, 'Password Not Match')
+ return redirect('register')
+ #return redirect ('/')
+ else:
+ return render(request, 'main/login.html')
+
+and this is my register.html form:
+ <form action="{% url 'register' %}" method="POST">
+ {% csrf_token %}
+ <fieldset>
+ <legend>Enter details</legend>
+ <ul>
+ {{ form.as_table }}
+
+ <button type="submit" class="mybutton _f-purple" value="submit">Register</button>
+ </ul>
+ </fieldset>
+ </form>
+
","Your register.html never renders the messages. I think this is what you need to add to the template:
+ {% if messages %}
+ <div class="mgs-area">
+ <div class="mgs-item">
+ {% for message in messages %}
+ <span {% if message.tags %} class="nav_item message-{{ message.tags }} " {% endif %}> {{ message }} </span>
+ {% endfor %}
+ </div>
+ </div>
+ {% endif %}
+
+You can style success and error messages differently by targeting the class generated from the message tags:
+
+class="nav_item message-{{ message.tags }}"
+
",python
+"How to find the most common name in 2 related listsI would like to seek help from the community.. I have 2 related lists here:
+names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
+votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
+
+The votes list is a result of a facial recognition algorithm matching from the corresponding names list. Then I shall link each True vote to the corresponding name, and find the most frequently occurred name to be the final 'winner'.
+I have tried 2 ways:
+characters = {}
+for name, vote in list(zip(names, votes)):
+ if vote == True:
+ characters[name] = characters.get(name, 0) + 1
+#print(characters)
+print(max(characters, key=characters.get))
+
+The output is 'owen_grady'
+from collections import Counter
+
+characters = [name for name, vote in list(zip(names, votes)) if vote == True]
+#print(characters)
+print(Counter(characters).most_common()[0][0])
+
+The output is also 'owen_grady'. Which way is more efficient: Dictionary? or List Comprehension with Counter?
+My ultimate question: is there another way (the most efficient) to get the result? I would like the output to be just 'owen_grady'
","You can use itertools.compress() to filter out all the False entries. The Counter option should be the most efficient; just pass the n argument to .most_common() so that it returns a single pair.
+Code:
+from itertools import compress
+from collections import Counter
+
+names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
+votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
+
+most_common = Counter(compress(names, votes)).most_common(1)[0][0]
+# Or with some syntax sugar:
+# [(most_common, _)] = Counter(compress(names, votes)).most_common(1)
+
+
+Update: I ran some benchmarks, and for this particular case a slightly optimized version of the first method performs better:
+from itertools import compress
+
+names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
+votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
+
+characters = list(compress(names, votes))
+most_common = max(set(characters), key=characters.count)
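If you want to repeat the comparison yourself, a minimal timeit sketch could look like this. The shortened names and votes lists and the function names with_counter and with_max are only for illustration, and the timings will differ from machine to machine:

```python
from collections import Counter
from itertools import compress
from timeit import timeit

# Shortened data: alan_grant gets 1 True vote, owen_grady gets 4.
names = ['alan_grant'] * 5 + ['owen_grady'] * 5
votes = [True] + [False] * 5 + [True] * 4


def with_counter():
    return Counter(compress(names, votes)).most_common(1)[0][0]


def with_max():
    characters = list(compress(names, votes))
    return max(set(characters), key=characters.count)


print(with_counter(), with_max())
print(timeit(with_counter, number=10_000), timeit(with_max, number=10_000))
```

Keep in mind that characters.count scans the whole list once per distinct name, so the max() variant is likely to lose its advantage when there are many different names.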
+
+
",python
+"Solidity ""require"" error message in PythonI am having trouble displaying the error message from the smart contract's require method in my Python console. I am using the web3.py module to interact with the contract.
+Here is the smart contract code:
+function burn(uint256 amount) public returns(bool){
+ require(balance[msg.sender]>=amount, "Not enough balance to burn");
+ ...
+}
+
+Calling this method trying to trigger the require restriction does nothing in the console. Checking on etherscan the transaction is not completed (which is the desired behaviour). I found some solutions online about Remix, but I want the error message to be displayed in my python code.
+Is the error message encrypted in some response, like the hash of the transaction, or is there another way to get the message?
","To get the revert reason of a transaction, you can use the fetch_transaction_revert_reason function from the web3-ethereum-defi library.
",python
+"Runtime variable for Oracle DATE type in AirflowI'm trying to define task that requires a run-time parameter (let's call it 'batch_dt') in Apache Airflow.
+I'm using OracleStoredProcedureOperator and the parameter of the procedure is of database type date.
+procedure use_dates (
+ i_date in date,
+ i_date2 in date,
+ i_date3 in date
+ );
+
+However I'm having hard time defining this parameter as runtime variable. I could use an exactly formatted string for particular database but don't want to depend on current NLS setting in the database.
+
+Airflow macros don't work: {{ dag_run.conf['batch_dt'] }} or even {{ macros.datetime.strptime(dag_run.conf['batch_dt'], '%Y-%m-%d') }} always returns a string, resulting in
+ORA-01861: literal does not match format string
+
+Using to_date('{{ dag_run.conf['batch_dt'] }}', 'DD-MM-YYYY') results in
+ORA-01858: a non-numeric character was found where a numeric was expected
+
+When I define date.today() right in the task, it works fine, however I need to use run-time variable.
+
+
+res_task = OracleStoredProcedureOperator(
+ task_id = 'mytask',
+ procedure = 'use_dates',
+ parameters =
+ {"i_date": date.today(), #works but is not runtime
+ "i_date2": "to_date('{{ dag_run.conf['batch_dt'] }}', 'DD-MM-YYYY')",
+ #returns a string "to_date('13-06-2022', 'DD-MM-YYYY')" which results in ORA-01858: a non-numeric character was found where a numeric was expected
+ "i_date3": "{{ macros.datetime.strptime(dag_run.conf['batch_dt'], '%Y-%m-%d' ) }}"
+ #returns a string '2022-06-13 00:00:00' which results in ORA-01861: literal does not match format string
+ }
+ )
+
+I was thinking about a macro that returns a datetime object at runtime, but it seems macros can only return strings. Any idea how this can be achieved?
","Jinja templating, used in Apache Airflow, renders template fields as strings by default. Therefore the result of your strptime call is converted into a string when rendered.
+To override this behavior and render native Python objects instead of plain strings, set the DAG's render_template_as_native_obj option to True (the default is False).
+with DAG(
+    'USE_DATES',
+    description='TEST DATE INPUT RUNTIME DAG',
+    ...
+    render_template_as_native_obj=True
+) as dag:
+    ...
+
+Your OracleStoredProcedureOperator call should now work as expected, passing Jinja-rendered parameters as native Python objects (datetime in this case).
+For more information, please refer to the official Airflow documentation:
+https://airflow.apache.org/docs/apache-airflow/stable/concepts/operators.html#rendering-fields-as-native-python-objects
",python
+"Define and use class methods from variableLet's say I have two classes with some methods.
+First class:
+@dataclass
+class MyFirstClass:
+ y1: List[float]
+ y2: List[float]
+
+ @property
+ def my_func1(self) -> List[float]:
+ return self.y1-self.y2
+
+ @property
+ def my_func2(self) -> List[float]:
+ return self.y1+self.y2
+
+ @property
+ def my_func3(self) -> List[float]:
+ return self.y1*self.y2
+
+Second class:
+@dataclass
+class MySecondClass:
+ list_vals: List[float]
+
+ @property
+ def norm1(self):
+ return self.list_vals
+
+ @property
+ def norm2(self):
+ return self.list_vals / numpy.mean(self.list_vals)
+
+So my question is two fold.
+Say in my main I have something like:
+def main():
+ FUNCTION = <?>
+ NORM = <?>
+
+ y1 = [1, 2, 3, 4, 5]
+ y2 = [3, 4, 5, 6, 7]
+
+ test = MyFirstClass(y1, y2).my_func1
+
+Then I just call MyFirstClass with the y1 and y2 lists and get an output.
+But as can be seen in the beginning of main I have FUNCTION and NORM. Is there any way to call MyFirstClass from there, and then reuse it all through main, i.e. something like:
+FUNCTION = MyFirstClass.my_func1
+
+test = FUNCTION(y1, y2)
+
+This doesn't work obviously. So how can one do that ?
+Also, and this is probably a build upon the above, how can I, once again, choose the norm function to be used in the other class ? For instance, if I update MyFirstClass a bit to:
+@dataclass
+ class MyFirstClassUpdated:
+ y1: List[float]
+ y2: List[float]
+ norm: NORM
+
+ @property
+ def my_func1(self) -> List[float]:
+ return norm(self.y1)-norm(self.y2)
+
+ @property
+ def my_func2(self) -> List[float]:
+ return norm(self.y1)+norm(self.y2)
+
+ @property
+ def my_func3(self) -> List[float]:
+ return norm(self.y1)*norm(self.y2)
+
+And then when the class is called it takes the NORM argument from the main function, i.e. something like:
+FUNCTION = MyFirstClass.func1
+NORM = MySecondClass.norm1
+
+test = FUNCTION(y1, y2, NORM)
+
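+I have no idea if this is even possible without making the code "uglier".
","You need to create an instance of MyFirstClass and pass it as the self argument to the function. Because my_func1 is declared with @property, MyFirstClass.my_func1 is a property object rather than a callable, so take its underlying getter through .fget (or drop @property and use the plain method). Also note that y1 and y2 must support arithmetic, for example as NumPy arrays; plain lists do not support the - operator.
+FUNCTION = MyFirstClass.my_func1.fget
+mfc = MyFirstClass(y1, y2)
+test = FUNCTION(mfc)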
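Another option, sketched here under a few assumptions, is to keep only the method name in FUNCTION and look it up on the instance with getattr. In this sketch the @property decorators are dropped, the arithmetic is done element by element on plain lists, and the helper run is a made-up name:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MyFirstClass:
    y1: List[float]
    y2: List[float]

    # Plain methods (no @property), so they can be looked up and called later.
    def my_func1(self) -> List[float]:
        return [a - b for a, b in zip(self.y1, self.y2)]

    def my_func2(self) -> List[float]:
        return [a + b for a, b in zip(self.y1, self.y2)]


def run(function_name: str, y1: List[float], y2: List[float]) -> List[float]:
    # Build the instance first, then fetch the bound method by its name.
    instance = MyFirstClass(y1, y2)
    return getattr(instance, function_name)()


FUNCTION = 'my_func1'  # chosen once, reused throughout main
print(run(FUNCTION, [1, 2, 3], [3, 4, 5]))  # [-2, -2, -2]
```

The same pattern would work for NORM: pass the chosen norm function (or its name) into the class and call it inside the methods.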
+
",python
+"Python: removing objects from a list by attribute valueclass MyClass():
+ def __init__(self, name, att1, att2):
+ ...
+
+myList = [MyClass("p1", 1, 1), MyClass("p2", 0, 0), MyClass("p3", 0, 1)]
+
+Now I want to remove every object from myList if its att2 == 1.
","Rather than removing instances from the list in place, build a new list with a list comprehension that excludes the unwanted instances. For example:
+class MyClass():
+    def __init__(self, name, att1, att2):
+        self._name = name
+        self._att1 = att1
+        self._att2 = att2
+    def __repr__(self):
+        return f'{self._name=}, {self._att1=}, {self._att2=}'
+
+myList = [MyClass("p1", 1, 1), MyClass("p2", 0, 0), MyClass("p3", 0, 1)]
+
+myList = [c for c in myList if c._att2 != 1]
+
+print(myList)
+
+Output:
+[self._name='p2', self._att1=0, self._att2=0]
+
",python
+"Use Conditional within Context MangerI want to be able to do something like this:
+file1 = 'data/f1.txt'
+file2 = None
+file3 = 'data/f3.txt'
+with open(file1) as f1, open(file2) as f2, open(file3) as f3:
+ for i in range(1000):
+ x,y,z = func(i)
+ if f1: f1.write(f"{x}\n")
+ if f2: f2.write(f"{y}\n")
+ if f3: f3.write(f"{z}\n")
+
+
+So that output is only written to files if paths are provided.
+I tried the following:
+file1 = 'data/f1.txt'
+file2 = None
+file3 = 'data/f3.txt'
+with open(file1) if file1 else None as f1, open(file2) if file2 else None as f2, open(file3) if file3 else None as f3:
+ for i in range(1000):
+ x,y,z = func(i)
+ if f1: f1.write(f"{x}\n")
+ if f2: f2.write(f"{y}\n")
+ if f3: f3.write(f"{z}\n")
+
+But I got TypeError: expected str, bytes or os.PathLike object, not NoneType.
+Is there a simple, pythonic way to achieve this?
+Thanks.
","By the time you've invented identifiers file{1,2,3},
+that's starting to be a Code Smell that you
+really want to fill a container with files.
+Create a class FilesWriter that is a context manager.
+Pass in a bunch of filespecs.
+It will simply discard any empty ones:
+    self.fspec_to_file = {}
+    for fspec in fspecs:
+        if fspec is not None:
+            self.fspec_to_file[fspec] = open(fspec, 'w')
+
+Now your instance is responsible for closing those
+when the with calls your __exit__ method.
+In the body of the with,
+caller will either invoke a write_to_all(s: str) method you provide,
+or will use the elements of fspec_to_file as needed.
+So the calling sequence would look like:
+    with FilesWriter([fspec1, fspec2, fspec3]) as fw:
+        for i in range(1000):
+            result = compute()
+            fw.write_to_all(result)
+
+and upon leaving the block your __exit__ method does:
+    for f in self.fspec_to_file.values():
+        f.close()
+
+Strictly speaking .close() can fail, e.g. due to full disk.
+Put it in a try block if needed.
+If any open() fails, it might make sense to
+close all previously opened files and die
+with fatal error.
+You seemed to need to know details about each open file.
+But if not, feel free to convert that dict to
+a simple set of open file handles.
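Putting the fragments above together, a minimal sketch of such a class might look like the following. The method name close_all and the temporary paths are assumptions made for this illustration, not part of the original description:

```python
import os
import tempfile


class FilesWriter:
    """Context manager that opens every non-None filespec for writing."""

    def __init__(self, fspecs):
        self.fspecs = fspecs
        self.fspec_to_file = {}

    def __enter__(self):
        try:
            for fspec in self.fspecs:
                if fspec is not None:  # silently skip missing paths
                    self.fspec_to_file[fspec] = open(fspec, 'w')
        except OSError:
            self.close_all()  # do not leak files opened before the failure
            raise
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close_all()
        return False  # never swallow exceptions raised in the with-body

    def close_all(self):
        for f in self.fspec_to_file.values():
            f.close()

    def write_to_all(self, s):
        for f in self.fspec_to_file.values():
            f.write(s)


# Usage: the None entry is ignored, the other two files receive every line.
tmp_dir = tempfile.mkdtemp()
path1 = os.path.join(tmp_dir, 'f1.txt')
path3 = os.path.join(tmp_dir, 'f3.txt')
with FilesWriter([path1, None, path3]) as fw:
    for i in range(3):
        fw.write_to_all(f'{i}\n')
```

If you would rather not write a class, contextlib.ExitStack from the standard library can also manage a variable number of open files.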
",python
+"Simple CNN Binary Classification Network with dataset consisting of more than 100000 image filesI am trying to build a simple CNN model for binary classification but the training dataset consists of over 100k of '.png' file. If I train the model by loading all the data at once, it will create a MemoryExhaustion Error. Can somebody help me to build the network to deal with such huge dataset?
","You can stream with yield statement.
+def load_at_once(image_names):
+    return [load(image_name) for image_name in image_names]  # exhausts memory
+
+def load_stream(image_names):
+    for image_name in image_names:
+        yield load(image_name)
+
+You can iterate over the images with a for statement. The load_stream function loads images one at a time, which prevents memory exhaustion as long as you don't keep every image in memory.
+Of course, streaming is slower than loading everything into memory when you use each image more than once, because the image is read again every time you need it.
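Since training usually consumes images in batches, the same idea can be extended to yield small lists instead of single images. This is only a sketch: load is a hypothetical stand-in for real image decoding, and load_batches is a made-up name:

```python
def load(image_name):
    # Hypothetical stand-in for real image decoding (e.g. reading a PNG).
    return f'pixels of {image_name}'


def load_batches(image_names, batch_size):
    """Yield lists of at most batch_size loaded images."""
    batch = []
    for image_name in image_names:
        batch.append(load(image_name))
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so the old batch can be freed
    if batch:
        yield batch  # final, possibly smaller batch


names = [f'img_{i}.png' for i in range(5)]
batch_sizes = [len(b) for b in load_batches(names, batch_size=2)]
print(batch_sizes)  # [2, 2, 1]
```

Only one batch is held in memory at a time, so memory use depends on batch_size rather than on the size of the dataset.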
",python
+"converting 3 digit integers in a column to dates?I have an a column with 3 digit integers as m/dd. e.g.
+410
+417
+505
+522
+
+I want to convert them to
+2022-04-10
+2022-04-17
+2022-05-05
+2022-05-22
+
+How can I accomplish this?
","You can use .apply() to format the integers as strings, and then use pd.to_datetime() to turn those strings into dates. Notably, this approach works even if the month is represented by two digits (i.e. October, November, or December):
+import pandas as pd
+
+df = pd.DataFrame([410, 417, 505, 522, 1222], columns=["dates"])
+df["dates"] = df["dates"].apply(lambda x: "{:02}/{:02}/2022".format(x // 100, x % 100))
+df["dates"] = pd.to_datetime(df["dates"], format="%m/%d/%Y")
+
+This outputs:
+ dates
+0 2022-04-10
+1 2022-04-17
+2 2022-05-05
+3 2022-05-22
+4 2022-12-22
+
",python
+"Communicating Python Exception to a shell scriptI have a test.py python script which contains following code.
+def f(x):
+ if x < 0:
+ raise Exception("negative number")
+
+ else:
+ return x
+
+I have written another shell script test.sh that runs the python function inside it. The code is as follows
+#!/bin/bash
+X=$1
+
+y=$(python3 -c "from test import f; print(f(`echo $X`))")
+echo this is y: $y
+
+The shell script works fine when input is positive i.e bash test.sh 1. This gives this is y: 1.
+However, when the input is negative i.e bash test.sh -1. It gives a python traceback.
+Traceback (most recent call last):
+ File "<string>", line 1, in <module>
+ File "/home/user/test.py", line 4, in f
+ raise Exception("negative number")
+Exception: negative number
+this is y:
+
+Question: what changes should be made to avoid the above output (avoid printing traceback).
+Expected output:
+this is y: exception
+
","Wrap the call in a try/except that prints exception instead of letting the traceback through.
+y=$(python3 -c "from test import f
+try: print(f($X))
+except Exception: print('exception')")
+
",python
+"Create Xpath using scrapyimport scrapy
+from scrapy.http import Request
+from scrapy.crawler import CrawlerProcess
+
+class TestSpider(scrapy.Spider):
+ name = 'test'
+ start_urls = ['https://rejestradwokatow.pl/adwokat/list/strona/1/sta/2,3,9']
+ custom_settings = {
+ 'CONCURRENT_REQUESTS_PER_DOMAIN': 1,
+ 'DOWNLOAD_DELAY': 1,
+ 'USER_AGENT': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
+ }
+
+
+ def parse(self, response):
+ books = response.xpath("//td[@class='icon_link']//a//@href").extract()
+ for book in books:
+ url = response.urljoin(book)
+ yield Request(url, callback=self.parse_book)
+
+
+
+ def parse_book(self, response):
+ wev={}
+ d1=response.xpath("//*[@class='line_list_K']//div//span")
+ for i in range(len(d1)):
+ if 'Status:' in d1[i].get():
+ d2=response.xpath("//div["+str(i+1)+"]//text()").get()
+ print(d2)
+
+I want to get the status value, but my code gives me empty output. This is the page link: https://rejestradwokatow.pl/adwokat/abramska-danuta-51494
","Why not select your element more specifically by its text and get the text from its following sibling:
+//span[text()[contains(.,'Status')]]/following-sibling::div/text()
+
+Example: http://xpather.com/ZUWI58a4
+To get the email:
+//span[text()[contains(.,'Email')]]/following-sibling::div/(concat(@data-ea,'@',@data-eb))
+
",python
+"How can i print the specific line number of a json fileI am using python to load a json file and using jsonschema to print errors according to the schema i have prepared.
+My question is how do i print a specific line of a json file from a loop:
+errors = sorted(validator.iter_errors(jsonData[a]), key=lambda e: e.path)
+for error in errors:
+ print(error.message, sep=", ")
+
+The output i get is 'lending_details' is a required property which is the error.message.
+What i want is to print: On line number 4 ,'lending_details' is a required property
+Is there a way to count and display the specific line number of a json file?
","In general, no, because by the time the JSON Schema evaluator sees the data instance, the data has been parsed from JSON text into a data structure and the line number information has been lost.
+To make this work, you will need to have a JSON decoder that can associate line numbers with each section of the data in a way that the JSON Schema evaluator can later make use of it when generating its errors. For example, I could see a decoder turning this JSON:
+{
+  "foo": {
+    "hello": [
+      "a",
+      "b",
+      "c"
+    ]
+  },
+  "bar": true
+}
+
+into this line number mapper:
+{
+  "": 1,
+  "/foo": 2,
+  "/foo/hello": 3,
+  "/foo/hello/0": 4,
+  "/foo/hello/1": 5,
+  "/foo/hello/2": 6,
+  "/bar": 9
+}
+
+...and then when the JSON Schema evaluator is generating an error at data instance "/bar", we can use this lookup table to insert "...at line 9" into the error.
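A rough sketch of such a line number mapper, using only the standard library, is shown below. It is not a full decoder: the helper names (_skip_ws, _line_at, _walk, line_map) and the sample document are made up for this illustration, it assumes the text is valid JSON, and it does not escape ~ or / inside keys as real JSON Pointers would.

```python
import json

_decoder = json.JSONDecoder()


def _skip_ws(text, pos):
    while pos < len(text) and text[pos] in ' \t\r\n':
        pos += 1
    return pos


def _line_at(text, pos):
    return text.count('\n', 0, pos) + 1


def _walk(text, pos, pointer, lines):
    """Record line numbers below the value at pos; return its end position."""
    pos = _skip_ws(text, pos)
    lines.setdefault(pointer, _line_at(text, pos))
    if text[pos] == '{':
        pos = _skip_ws(text, pos + 1)
        while text[pos] != '}':
            key_line = _line_at(text, pos)
            key, pos = _decoder.raw_decode(text, pos)  # the member name
            child = f'{pointer}/{key}'
            lines[child] = key_line
            pos = _skip_ws(text, pos)  # now at the colon
            pos = _skip_ws(text, _walk(text, pos + 1, child, lines))
            if text[pos] == ',':
                pos = _skip_ws(text, pos + 1)
        return pos + 1
    if text[pos] == '[':
        pos = _skip_ws(text, pos + 1)
        index = 0
        while text[pos] != ']':
            pos = _skip_ws(text, _walk(text, pos, f'{pointer}/{index}', lines))
            index += 1
            if text[pos] == ',':
                pos = _skip_ws(text, pos + 1)
        return pos + 1
    _, end = _decoder.raw_decode(text, pos)  # string, number, true/false/null
    return end


def line_map(text):
    json.loads(text)  # fail early on invalid JSON; _walk assumes valid input
    lines = {}
    _walk(text, 0, '', lines)
    return lines


sample = '''{
  "name": "loan-1",
  "items": [
    10,
    20
  ]
}'''
print(line_map(sample))
# {'': 1, '/name': 2, '/items': 3, '/items/0': 4, '/items/1': 5}
```

With jsonschema, the path elements of an error are available as error.absolute_path, so a lookup key could be built with ''.join(f'/{part}' for part in error.absolute_path). For a missing required property, that path points at the enclosing object, so the reported line would be the line where that object starts.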
",python
+"I'm trying to make a secure password generating program. Well, I was debugging the Index Error that I get when I ran this incomplete programI seem to have run into wall here. It keeps showing an Index Error at line "unordered_letter[letter] = random.choice(letters[nr_letters - 1])". My inexperienced eyes are unable to catch the issue so kindly help with the same. Thanks in advance!
+import random
+
+letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
+ 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R',
+ 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
+numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
+symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']
+
+print("Welcome to the PyPassword Generator!")
+nr_letters = int(input("How many letters would you like in your password?\n"))
+nr_symbols = int(input(f"How many symbols would you like?\n"))
+nr_numbers = int(input(f"How many numbers would you like?\n"))
+
+unordered_letter = []
+
+for letter in range(0, nr_letters):
+ unordered_letter[letter] = random.choice(letters[nr_letters - 1])
+print(unordered_letter)
+
","Change:
+for letter in range(0, nr_letters):
+    unordered_letter[letter] = random.choice(letters[nr_letters - 1])
+
+To this:
+for letter in range(0, nr_letters):
+    unordered_letter.append(random.choice(letters))
+
+Change #1:
+Use the .append() method. Your original code raises an IndexError because it assigns to an index that does not exist yet: the list is empty.
+Change #2:
+Change random.choice(letters[nr_letters - 1]) to random.choice(letters) if you want a variety of letters rather than one single letter. Your original code always picks the element at index nr_letters - 1 of letters, while the new code chooses random elements from letters.
+Use the same process for symbols and numbers.
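For reference, the whole generator could be sketched like this. The function name generate_password is made up, and random.choices with k is used here instead of the explicit loop:

```python
import random
import string

letters = list(string.ascii_letters)
numbers = list(string.digits)
symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']


def generate_password(nr_letters, nr_symbols, nr_numbers):
    # random.choices draws k items (with replacement) from a list.
    chars = (random.choices(letters, k=nr_letters)
             + random.choices(symbols, k=nr_symbols)
             + random.choices(numbers, k=nr_numbers))
    random.shuffle(chars)  # mix the three groups together
    return ''.join(chars)


password = generate_password(nr_letters=6, nr_symbols=2, nr_numbers=2)
print(password)
```

The random module is not designed for security purposes. For passwords that need to be genuinely hard to guess, the standard library's secrets module (for example secrets.choice) is the better fit.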
",python
+"Is there any method to transfer files between docker container using python scripts?Project in container A and has a function to restore data,which may need to send files to container B
+Or should I not do it this way?
","Try mounting the same host directory as a volume when you run containers A and B. Options such as -v have to come before the image name:
+docker run -t -v /shared/path/on/host:/data/path/in/container-a imagename:tag
+docker run -t -v /shared/path/on/host:/data/path/in/container-b imagename:tag
+When container A restores a file into that path, the file lands in the shared host directory, and container B can use it.
",python
+"How do you use patch() as a context manager?I have a class that mocks database functionality which does not subclass Mock or MagicMock because it defines its own __init__ method:
+class DatabaseMock():
+ def __init__(self, host=None):
+ self.host = host
+ self.x = {}
+
+ # other methods that mutate x
+
+There is a function I want to test that makes an API call to the real database, so I patched it out:
+class TestFunctions():
+ def test_function(self):
+ with patch("path.to.database.call", DatabaseMock) as mock:
+ result = function_i_am_testing()
+ assert mock.x == result
+
+There is a field of the DatabaseMock called x, but in the patch context, mock.x returns
+an AttributeError. This leads to me believe mock is not really an instance of DatabaseMock. Also, I had tried making x a class level object which does make x visible, but its state would persist through separate test calls which I do not want.
+What is mock and how can I reference the mocked object in the context?
","I have figured out the issue. When patch is given a class, it returns that class, not an instance of it. So mock in my example is not a DatabaseMock object but a reference to the class. This is why class-level variables are visible but instance fields are not. To get the functionality I wanted, I did this:
+class TestFunctions():
+    def test_function(self):
+        with patch("path.to.database.call") as mock:
+            mock.return_value = DatabaseMock()
+            result = function_i_am_testing()
+            assert mock.return_value.x == result
+
+Now, mock is a MagicMock object whose return value is the object I need.
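Here is a self-contained sketch of the same pattern that can be run as a script. The database namespace and the body of function_i_am_testing are hypothetical stand-ins for the real module, and patch.object is used so that no import path is needed:

```python
from types import SimpleNamespace
from unittest.mock import patch


class DatabaseMock:
    def __init__(self, host=None):
        self.host = host
        self.x = {}


# Hypothetical stand-in for the module that owns the real database call.
database = SimpleNamespace(connect=lambda host=None: None)


def function_i_am_testing():
    db = database.connect('localhost')
    db.x['answer'] = 42
    return db.x


with patch.object(database, 'connect') as mock:
    mock.return_value = DatabaseMock()  # fresh instance for this test only
    result = function_i_am_testing()

print(mock.return_value.x == result)  # True
```

Because a new DatabaseMock is created inside the with block, its state does not leak into other tests, and the original connect is restored when the block exits.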
",python
+"Remove series border lines from plotly express area chartpx.area shows a line atop each area series. How can I remove it?
+The documentation only shows how to remove lines from go.Scatter calls (with mode='none').
+Here's an example (notebook), where I'd like to remove the dark blue and red lines atop the light blue and red areas, respectively (to avoid the perception that the red series is nonzero where it stacks atop the blue series):
+import plotly.express as px
+
+px.area(y=[[1, 2, 3], [0, 0, 1]])
+
","IIUC this is what you're looking for:
+import plotly.express as px
+
+fig = px.area(y=[[1, 2, 3], [0, 0, 1]])
+
+To hide the lines so that only the filled areas remain:
+for i in range(len(fig['data'])):
+    fig['data'][i]['line']['width']=0
+
+To fill each area with the same color as its line:
+fig.for_each_trace(lambda trace: trace.update(fillcolor = trace.line.color))
+
",python
+"Create CSV output with an uneven dictionary of single values and listsI am using python and running my example on google colab. I am new to csv writing.
+I want to csv write out like the following.
+desired output example
+The issue I am having is with my lists in 'add' and 'reg'. When I run my example code I get the following.
+my output
+What should I do to get my desired output? Below is my attempt. If there is another post that has a similar situation as mine could you link it as I have already tried searching and could not find someone with the the issue as mine.
+d = [{'local_date': '12/16/2022', 'local_time': '12:00', 'add':[2, 8, 22, 17], 'reg':[1001, 1002, 1003, 1004] }]
+fields = ['local_date', 'local_time', 'add', 'reg']
+
+with open("tester.csv", "w") as outfile:
+ writer = csv.DictWriter(outfile, fieldnames = fields)
+ writer.writeheader()
+ writer.writerows(d)
+
","using csv
+import csv
+
+d = [{'local_date': '12/16/2022', 'local_time': '12:00', 'add':[2, 8, 22, 17], 'reg':[1001, 1002, 1003, 1004]}]
+
+fields = ['local_date', 'local_time', 'add', 'reg']
+
+# expand each record into one row per (add, reg) pair
+data = [{**dat, 'add':add, 'reg':reg} for dat in d for add, reg in zip(dat['add'], dat['reg'])]
+
+# newline="" stops the csv module from writing blank lines between rows on Windows
+with open("tester.csv", "w", newline="") as outfile:
+    writer = csv.DictWriter(outfile, fieldnames = fields)
+    writer.writeheader()
+    writer.writerows(data)
+
+using pandas
+import pandas as pd
+
+d = [{'local_date': '12/16/2022', 'local_time': '12:00', 'add':[2, 8, 22, 17], 'reg':[1001, 1002, 1003, 1004]}]
+
+df = pd.DataFrame(d).explode(['add', 'reg'])
+
+df.to_csv('tester.csv', index=False)
+
",python
+"Parse all valid datetime strings in json recursivelyI have a json blob of the following format. Is there a way to identify all strings which match the format
+%Y-%m-%dT%H:%M:%S
+
+And convert them to datettime strings
+{
+ "data":[
+ {
+ "name":"Testing",
+ "dob":"2001-01-01T01:00:30"
+ },
+ {
+ "name":"Testing2",
+ "dob":"2001-01-01T01:00:30",
+ "licence_info":{
+ "issue_date":"2020-01-01T01:00:30"
+ }
+ }
+ ]
+}
+
","The easiest way to do this is to parse each value and attempt to convert it to a datetime. You could do something like this:
+import json
+from datetime import datetime
+
+def convert_dates(value):
+    if isinstance(value, dict):
+        return { k : convert_dates(v) for k, v in value.items() }
+    elif isinstance(value, list):
+        return [ convert_dates(v) for v in value ]
+    else:
+        try:
+            dt = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S')
+            return dt
+        except (TypeError, ValueError):  # TypeError covers non-string values such as numbers
+            return value
+
+jstr = '''
+{
+ "data":[
+ {
+ "name":"Testing",
+ "dob":"2001-01-01T01:00:30"
+ },
+ {
+ "name":"Testing2",
+ "dob":"2001-01-01T01:00:30",
+ "licence_info":{
+ "issue_date":"2020-01-01T01:00:30"
+ }
+ }
+ ]
+}
+'''
+d = json.loads(jstr)
+convert_dates(d)
+
+Output:
+{
+ 'data': [
+ {'name': 'Testing',
+ 'dob': datetime.datetime(2001, 1, 1, 1, 0, 30)
+ },
+ {'name': 'Testing2',
+ 'dob': datetime.datetime(2001, 1, 1, 1, 0, 30),
+ 'licence_info': {'issue_date': datetime.datetime(2020, 1, 1, 1, 0, 30)}
+ }
+ ]
+}
+
",python
+"i get an empty list when scraping a websitefrom bs4 import BeautifulSoup
+import requests
+
+link = requests.get("https://www.upwork.com/nx/jobs/search/?q=web%20scraping&sort=recency")
+source = link.content
+soup = BeautifulSoup(source, "lxml")
+
+title = soup.find_all("h4", {"class": "my-0 p-sm-right job-tile-title"})
+print(title)
+
+I am trying to scrape the job titles, but the problem is that I get an empty list,
+even though the same approach works just fine on other websites.
+Please help me.
","You get an empty list because this data is loaded by a separate request, so it is not in the HTML that requests.get() returns. You can see that request in the Network tab of your browser's developer tools:
+Needed request
",python
+"Easy Leetcode question, I don't understand what I'm doing wrongI'm beginner in coding and I am working on some easy leetcode questions along the way. The question is converting roman numerals to integers and when I run this code, it says the "string index out of range". Rather than looking for other answers, I wanted to understand what I did wrong. I appreciate the help!
+s = "CCXLVII"
+
+roman_dict = {
+ 'C' : 100,
+ 'L' : 50,
+ 'X' : 10,
+ "V" : 5,
+ "I" : 1
+}
+
+temp = 0
+
+for i in range(len(s)):
+    if roman_dict[s[i]] > roman_dict[s[i+1]] and i + 1 < len(s): #string index out of range
+        temp = temp - roman_dict[s[i]]
+    else:
+        temp = temp + roman_dict[s[i]]
+
","You just have to swap the order of the conditionals:
+Change:
+if roman_dict[s[i]] > roman_dict[s[i+1]] and i + 1 < len(s):
+
+To:
+if i + 1 < len(s) and roman_dict[s[i]] > roman_dict[s[i+1]]:
+
+This makes it so that the boundary check is done before attempting to access the i+1 index. Python short-circuits the and operator: as soon as i + 1 < len(s) is false, the right-hand side is not evaluated, so s[i+1] is never accessed.
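+As a minimal sketch of the reordered condition in a complete function: the name roman_to_int and the full symbol table are illustrative additions, and the comparison uses < on the assumption that the intended rule is the standard one (a smaller symbol placed before a larger one is subtracted), which differs from the > in the question's code.
+
```python
# Values for all seven Roman numeral symbols (the question listed only five).
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(s: str) -> int:
    total = 0
    for i in range(len(s)):
        value = ROMAN_VALUES[s[i]]
        # The bounds check comes first: when i is the last index, `and`
        # short-circuits and s[i + 1] is never evaluated.
        if i + 1 < len(s) and value < ROMAN_VALUES[s[i + 1]]:
            total -= value  # a smaller symbol before a larger one is subtracted
        else:
            total += value
    return total


print(roman_to_int("CCXLVII"))  # 247
```
+
+Swapping the two operands of and back would bring the IndexError back on the last character, because s[i + 1] would then be evaluated before the bounds check.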
",python
+"How to persist an in-memory monetdbe db to local disk
+I am using an embedded MonetDB database in Python using monetdbe.
+I can see how to create a new connection with the :memory: setting,
+but I can't see a way to persist the created database and tables for later use.
+Once an in-memory session ends, all data is lost.
+So I have two questions:
+
+- Is there a way to persist an in-memory db to local disk?
+- Once an in-memory db has been saved to local disk, is it possible to load the db into memory at a later point to allow fast data analytics? At the moment it looks like if I create a connection from a file location, my queries read from local disk rather than memory.
+
","It is a little bit hidden away admittedly, but you can check out the following code snippet from the movies.py example in the monetdbe-examples repository:
+import monetdbe
+
+database = '/tmp/movies.mdbe'
+
+with monetdbe.connect(database) as conn:
+ conn.set_autocommit(True)
+ conn.execute(
+ """CREATE TABLE Movies
+ (id SERIAL, title TEXT NOT NULL, "year" INTEGER NOT NULL)""")
+
+So in this example the single argument to connect is just the desired path to your database directory. This is how you can (re)start a database that stores its data in a persistent way on a file system.
+Notice that I have intentionally removed the Python lines from the example in the actual repo that start with the comment # Removes the database if it already exists, just to make the example in this answer persistent.
+I haven't run the code, but I expect that if you run it twice in a row, the second run will return a database error on the execute statement, as the Movies table should already be there.
+And just to be sure, don't use the /tmp directory if you want your data to persist between restarts of your computer.
",python
+"How to parse JSONP response in python?
+How can I parse a JSONP response? I tried json.loads(), but it does not work for JSONP.
","Based on the following description:
+
+JSONP is JSON with padding, that is, you put a string at the beginning
+and a pair of parenthesis around it.
+
+I removed the padding from the string and then used json.loads():
+import requests
+from json import loads
+
+response = requests.get(link)
+startidx = response.text.find('(')
+endidx = response.text.rfind(')')
+data = loads(response.text[startidx + 1:endidx])
+
+This works.
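+The same idea can be tried without a network call. This is a small sketch using only the standard library; the helper name parse_jsonp and the sample payload are made up for illustration.
+
```python
import json


def parse_jsonp(text: str):
    """Strip the callback padding from a JSONP string and parse the JSON inside."""
    start = text.find("(")
    end = text.rfind(")")
    if start == -1 or end == -1 or end < start:
        raise ValueError("not a JSONP payload")
    return json.loads(text[start + 1:end])


# Hypothetical JSONP response body: callback name, parentheses, trailing semicolon.
sample = 'handleResponse({"user": "alice", "ids": [1, 2, 3]});'
print(parse_jsonp(sample))
```
+
+Using find for the first parenthesis and rfind for the last one keeps any parentheses inside the JSON strings intact.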
",python
+"OpenCV Limit FPS While Running
+I'm using OpenCV in Python to do image processing. When I measure the FPS, it's not stable: sometimes it is 10, sometimes 12. I want to keep the FPS stable at 9 frames per second. Is there any way to do that?
+EDIT: I'm using my laptop's webcam, but I also have a Hikvision IP camera. I need a solution that is independent of the camera.
+Here is how I'm measuring FPS.
+while True:
+
+ timer = cv2.getTickCount()
+ ret, img = cap.read()
+
+ fps = cv2.getTickFrequency()/(cv2.getTickCount()-timer)
+ cv2.putText(img, str(int(fps)), (75, 75),cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
+ cv2.imshow("Tracking", img)
+
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ break
+
+cap.release()
+cv2.destroyAllWindows()
+
","I don't know why you need that, but here is your answer:
+You cannot expect a camera to always deliver an exact, constant frame rate, because of low-level device design. If a camera spec says 30 fps, the actual rate can vary, for example between 30 and 33 fps.
+What you can do is pull 9 frames per second from the buffer and ignore the extra frames the camera delivers.
+Here is a good discussion to check.
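+A rough sketch of the "keep 9 frames per second, ignore the rest" idea, written against plain timestamps so it runs without a camera. The function name select_frames and the simulated 30 fps timestamps are assumptions for illustration; in a real capture loop the timestamps would come from a clock such as time.monotonic() taken right after each cap.read().
+
```python
def select_frames(timestamps, target_fps):
    """Return the indices of the frames to keep so that roughly
    target_fps frames per second survive; the rest are dropped."""
    interval = 1.0 / target_fps
    kept = []
    next_due = None
    for index, t in enumerate(timestamps):
        if next_due is None:
            next_due = t  # the first frame is always kept
        if t >= next_due:
            kept.append(index)
            # Advance along the fixed schedule (not from t) so that small
            # delays do not accumulate; skip slots that were already missed.
            while next_due <= t:
                next_due += interval
    return kept


# One second of simulated capture timestamps from a 30 fps camera.
timestamps = [i / 30 for i in range(30)]
kept = select_frames(timestamps, target_fps=9)
print(len(kept), kept)  # keeps 9 of the 30 frames
```
+
+Frames that are not selected are still read from the camera and then discarded, which keeps the driver's buffer from filling up with stale images.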
",python
+"Get elements from list in python by ¿index?
+I want to get the ids from this list and put them in another list to use with
+random.choice(list)
+
+
+[<Member id=986970159736586261 name='Nasgar-Bot' discriminator='5799'
+bot=True nick=None guild=>, <Member
+id=568157479020527636 name='ElmerKao' discriminator='0058' bot=False
+nick=None guild=>]
+
+How can I get the ids from here and put them into a list to use that command?
","The way you worded your problem wasn't very clear, but I think I understood: you want to add a user id to a list. Let's imagine the user sent a message that you passed to your function with ctx.
+ids = []
+discord_id = ctx.message.author.id
+ids.append(discord_id)
+
+I am not sure if this answer is clear enough, but try to give more context when asking a question.
+PS: don't use list as the name of a list; it shadows the built-in list type and can break your code.
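+If the goal is to collect the id of every member in the list shown in the question, a list comprehension is one way to do it. This is a sketch only: the Member dataclass below is a hypothetical stand-in for the real member objects and models just the attributes that appear in the question's output.
+
```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    # Hypothetical stand-in for the real member objects; only the
    # attributes visible in the question's output are modelled.
    id: int
    name: str
    bot: bool


members = [
    Member(id=986970159736586261, name="Nasgar-Bot", bot=True),
    Member(id=568157479020527636, name="ElmerKao", bot=False),
]

# Collect the id attribute of every member into a new list.
member_ids = [member.id for member in members]
chosen_id = random.choice(member_ids)
print(member_ids, chosen_id)
```
+
+The result is stored in member_ids rather than list, so the built-in list type stays usable.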
",python
+"Merging pandas dataframes to fill in the gaps
+Have been struggling with this for a bit today. I've got a master dataframe that is missing some values, and a secondary one that has these values which I would like to add in. The key to match on is column 1.
+d1 = {1:['Test','Test1','Test2'], 2:['A','B','C']}
+d2 = {1:['Something','Test','Test1','Test2','Test3','Test4'], 2:['z',None,None,None,'x','y'],3:['Blah','Blah','Blah','Blah','Blah','Blah']}
+
+df1 = pd.DataFrame(data=d1)
+df2 = pd.DataFrame(data=d2)
+
+df1
+ 1 2
+0 Test A
+1 Test1 B
+2 Test2 C
+
+df2
+ 1 2 3
+0 Something z Blah
+1 Test None Blah
+2 Test1 None Blah
+3 Test2 None Blah
+4 Test3 x Blah
+5 Test4 y Blah
+
+
+
+The outcome I'm looking for is:
+ 1 2 3
+0 Something z Blah
+1 Test A Blah
+2 Test1 B Blah
+3 Test2 C Blah
+4 Test3 x Blah
+5 Test4 y Blah
+
+
+Any ideas?
","You can use a map and fillna:
+df2[2] = df2[2].fillna(df2[1].map(df1.set_index(1)[2]))
+
+Output:
+ 1 2 3
+0 Something z Blah
+1 Test A Blah
+2 Test1 B Blah
+3 Test2 C Blah
+4 Test3 x Blah
+5 Test4 y Blah
+
",python
+"How to bypass Terms and Conditions agreement with Beautiful Soup
+I want to scrape this website: https://cage.dla.mil/Home/UsageAgree using Beautiful Soup.
+What I'm doing:
+import requests
+url = "https://cage.dla.mil/Home/UsageAgree"
+r = requests.get(url)
+soup = BeautifulSoup(r.content, "html.parser")
+print(soup)
+
+which returns HTML from a cookie agreement page.
+What I am then looking for is to bypass this to scrape the content of the actual page once we accept the cookies.
+I followed this post: Scraping a webpage using Python (beautiful soup) that requires "I agree to cookies" button being clicked?
+and did:
+import requests
+url = 'https://cage.dla.mil/'
+s = requests.Session()
+s.cookies.update({'agree': 'True'})
+s.get(url)
+soup = BeautifulSoup(r.content, "html.parser")
+print(soup)
+
+but I'm still getting the agreement page.
+It seems that one of the cookies always gives a unique value. I'm not sure how to deal with this.
","Well, this should work.
+import requests
+from bs4 import BeautifulSoup
+
+headers = {
+ "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0"
+}
+
+with requests.Session() as s:
+ token = (
+ BeautifulSoup(
+ s.get(
+ "https://cage.dla.mil/Home/UsageAgree",
+ headers=headers,
+ ).text,
+ "lxml",
+ ).select_one("form input")["value"]
+ )
+ payload = {
+ "__RequestVerificationToken": token,
+ "returningURL": "",
+ }
+ _ = s.post(
+ "https://cage.dla.mil/Home/UsageAgree",
+ data=payload,
+ headers=headers
+ )
+ soup = (
+ BeautifulSoup(
+ s.get("https://cage.dla.mil/", headers=headers).text,
+ "lxml",
+ ).select("#briefnewslist > div > p > em")
+ )
+ print("\n".join(p.getText(strip=True) for p in soup))
+
+Output:
+Scheduled Maintenance
+SAM Validation: Unable To Find A Matching Entity When Asked To Enter Or Validate My Entity Information
+SAM Validation: Continue A Registration Update Or Renewal If Validation Fails
+SAM.gov Registration for Financial Assistance
+Financial Assistance Update
+CAGE Expiration Date
+
",python
+"How to extract GPS location from HEIC files?
+I use the following code to extract GPS location both from JPG and HEIC files:
+#coding=utf-8
+from PIL import Image
+from urllib.request import urlopen
+from PIL.ExifTags import TAGS
+from PIL.ExifTags import GPSTAGS
+
+from pillow_heif import register_heif_opener
+
+def get_exif(filename):
+ image = Image.open(filename)
+ image.verify()
+ return image._getexif()
+
+def get_geotagging(exif):
+ if not exif:
+ raise ValueError("No EXIF metadata found")
+
+ geotagging = {}
+ for (idx, tag) in TAGS.items():
+ if tag == 'GPSInfo':
+ if idx not in exif:
+ raise ValueError("No EXIF geotagging found")
+
+ for (key, val) in GPSTAGS.items():
+ if key in exif[idx]:
+ geotagging[val] = exif[idx][key]
+
+ return geotagging
+
+register_heif_opener()
+
+my_image='IMG_9610.HEIC'
+#my_image='IMG_20210116_215317.jpg'
+
+exif = get_exif(my_image)
+labeled = get_geotagging(exif)
+print(labeled)
+
+This code works well with JPEG files, but returns the following error with HEIC:
+AttributeError: _getexif
+
+If I add the following function
+def get_labeled_exif(exif):
+ labeled = {}
+ for (key, val) in exif.items():
+ labeled[TAGS.get(key)] = val
+
+ return labeled
+
+and replace '_getexif()' with 'getexif()' then it works for both files, but the GPS data is not decoded there: I get 'GPSInfo': 1234, and get_geotagging() doesn't work for such exif.
+How could I fix it?
","UPDATED POST 06-12-2022
+The code below is able to extract the GEO tagging information from a HEIC image file on my system.
+from PIL import Image
+from pillow_heif import register_heif_opener
+
+
+def get_exif(filename):
+ image = Image.open(filename)
+ image.verify()
+ return image.getexif().get_ifd(0x8825)
+
+
+def get_geotagging(exif):
+ geo_tagging_info = {}
+ if not exif:
+ raise ValueError("No EXIF metadata found")
+ else:
+ gps_keys = ['GPSVersionID', 'GPSLatitudeRef', 'GPSLatitude', 'GPSLongitudeRef', 'GPSLongitude',
+ 'GPSAltitudeRef', 'GPSAltitude', 'GPSTimeStamp', 'GPSSatellites', 'GPSStatus', 'GPSMeasureMode',
+ 'GPSDOP', 'GPSSpeedRef', 'GPSSpeed', 'GPSTrackRef', 'GPSTrack', 'GPSImgDirectionRef',
+ 'GPSImgDirection', 'GPSMapDatum', 'GPSDestLatitudeRef', 'GPSDestLatitude', 'GPSDestLongitudeRef',
+ 'GPSDestLongitude', 'GPSDestBearingRef', 'GPSDestBearing', 'GPSDestDistanceRef', 'GPSDestDistance',
+ 'GPSProcessingMethod', 'GPSAreaInformation', 'GPSDateStamp', 'GPSDifferential']
+
+ for k, v in exif.items():
+ try:
+ geo_tagging_info[gps_keys[k]] = str(v)
+ except IndexError:
+ pass
+ return geo_tagging_info
+
+
+register_heif_opener()
+
+my_image = 'IMG_8362.heic'
+image_info = get_exif(my_image)
+results = get_geotagging(image_info)
+print(results)
+# x used to mask data
+{'GPSLatitudeRef': 'N',
+'GPSLatitude': '(3x.0, 5x.0, 1x.0x)',
+'GPSLongitudeRef': 'W',
+'GPSLongitude': '(8x.0, 2x.0, 5x.2x)',
+'GPSAltitudeRef': "b'\\x00'",
+'GPSAltitude': '279.63243243243244',
+'GPSSpeedRef': 'K',
+'GPSSpeed': '0.04649941997239198',
+'GPSImgDirectionRef': 'T',
+'GPSImgDirection': '274.37165833514456',
+'GPSDestBearingRef': 'T',
+'GPSDestBearing': '27x.37165833514456',
+'GPSDateStamp': '2022:06:12'}
+
+----------------------------------------
+My system information
+----------------------------------------
+Platform: Apple
+OS Version: macOS Catalina 10.15.7
+Python: 3.9
+Pillow: 9.1.1
+pillow_heif: 0.3.0
+----------------------------------------
+
+ORIGINAL POST 06-11-2022
+The short answer is that Pillow does not currently support the High Efficiency Image Format (HEIF) file format.
+Reference:
+
+One of the workarounds for this issue is pyheif. This Python package has the functionality to convert a HEIC image to a JPEG one. After this transformation Pillow will be able to read the data from the image.
+Another workaround for this format reading problem is piexif. Here is an answer that I posted on converting a TIFF file to a JPEG one for reading with Pillow.
+You could also use ExifTool, which reads HEIC files out of the box. Using it is slightly more complicated, because it requires using subprocess.
",python
+"Python function overloading recommended approach
+Assume a function that takes an object as a parameter. There could be various ways to express the creation of the parameter object, some of which are more expressive and likely easier to use.
+To give a simple example, we have a function which takes a DateTime. We also want to accept string representations of a DateTime, if possible (for example '20220606').
+# version 1, strict. must send a DateTime
+def UsefulFunc(startdate: DateTime) -> None:
+ pass
+
+# version 2, allow more types, but loose on type hints
+def UsefulFunc(startdate: (DateTime, str)) -> None:
+ # check if type is str, convert to DateTime if yes
+ pass
+
+# version 3, multiple signatures to accept and call the base function
+def UsefulFuncString(startdatestr: str) -> None:
+ # convert startdatestr to DateTime
+ UsefulFunc(startdate)
+
+# … …
+
+What approach is recommended in Python (I come from C# background)? If there's no clear indication/ or decision is based on situation, what are the considerations?
","After some research, and taking inspiration from @Copperfield's answer, I found an elegant solution to the problem.
+Let's first rephrase the problem: we have a function that takes an object. We want to provide some overloads that do the validations, conversions, etc. and then call the base function, which accepts the object. We also need to reject any call that does not match one of the implemented signatures.
+The library that I found very useful was multipledispatch. An easy example:
+from multipledispatch import dispatch
+
+@dispatch(int, int)
+def add_nums(num1: int, num2: int) -> int:
+ return num1 + num2
+
+@dispatch(str, str)
+def add_nums(num1: str, num2: str) -> int:
+ # do some useful validations/ object transformations
+ # implement any intermediate logic before calling the base func
+ # this lets the base function do its intended job rather than
+ # implementing the overloads itself
+ return add_nums(int(num1), int(num2))
+
+
+If we call add_nums(40, 15), we get 55, as the (int, int) version gets called. add_nums('10', '15') gives us 25, as expected, because the (str, str) version gets called.
+It becomes very interesting when we call add_nums(10, 10.0), as this fails with NotImplementedError: Could not find signature for add_nums: <int, float>. Essentially, any call not in (int, int) or (str, str) format fails with a NotImplementedError exception.
+This is by far the closest behaviour to function overloading in typed languages.
+The only concern I have is that this library was last updated on Aug 9, 2018.
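+For the single-argument case in the question (a datetime or its string form), the standard library's functools.singledispatch may be enough and avoids the extra dependency. This is a swapped-in technique, not the multipledispatch approach above, and it dispatches on the first argument only. The name useful_func, the '%Y%m%d' input format and the ISO string return value are assumptions for illustration.
+
```python
from datetime import datetime
from functools import singledispatch


@singledispatch
def useful_func(startdate):
    # Fallback for any type without a registered overload.
    raise NotImplementedError(f"Unsupported type: {type(startdate).__name__}")


@useful_func.register
def _(startdate: datetime) -> str:
    # Base implementation: works on a real datetime object.
    return startdate.strftime("%Y-%m-%d")


@useful_func.register
def _(startdate: str) -> str:
    # Overload: convert the string, then delegate to the datetime version.
    return useful_func(datetime.strptime(startdate, "%Y%m%d"))


print(useful_func("20220606"))
print(useful_func(datetime(2022, 6, 6)))
```
+
+Calling useful_func(42) falls through to the undecorated base function and raises NotImplementedError, which mirrors how multipledispatch rejects unknown signatures.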
",python
+"PySpark dataframe is changed while using the 'write' function
+This is a real mystery for me: when I try to write a PySpark df into an Azure table using jdbc, I run into a strange situation. While the 'write' function runs, my table changes for no apparent reason, and wrong data is sent to Azure. Afterwards my PySpark df contains the same wrong data. Here is part of the code I have written:
+print(sparkDF_cleaned.show())
+sparkDF_cleaned.write \
+ .format("jdbc") \
+ .mode("overwrite") \
+ .option("url", jdbcUrl) \
+ .option("dbtable", "dbo.upsert_test") \
+ .option("user", jdbcUsername) \
+ .option("password", jdbcPassword) \
+ .save()
+print(f"data loaded to table {db_table_name}")
+print(sparkDF_cleaned.show())
+
+Output is next:
+sparkDF_cleaned :
++------------+---+-----+----------+----------------------+
+| id_date| id|value| _date|datetime_of_extraction|
++------------+---+-----+----------+----------------------+
+|1 2022-05-01| 1| 17|2022-05-01| 2022-06-01|
+|1 2022-05-06| 1| 6|2022-05-06| 2022-06-13|
+|2 2022-05-02| 2| 10|2022-05-02| 2022-06-01|
+|3 2022-05-03| 3| 15|2022-05-03| 2022-06-01|
++------------+---+-----+----------+----------------------+
+
+data loaded to table upsert_test
+
+sparkDF_cleaned :
++------------+---+-----+----------+----------------------+
+| id_date| id|value| _date|datetime_of_extraction|
++------------+---+-----+----------+----------------------+
+|1 2022-05-06| 1| 6|2022-05-06| 2022-06-13|
+|2 2022-05-02| 2| 5|2022-05-02| 2022-06-13|
++------------+---+-----+----------+----------------------+
+
+The Azure table receives the data as it is in the second table. Dear colleagues, why does this happen?
+Thanks for your time in advance.
","I still can't avoid the problem when using the PySpark jdbc overwrite with a PySpark-transformed table.
+However, I have decided to write the PySpark table to a local file, read the saved data back from the file, and then post the data using the PySpark jdbc overwrite function.
+It does not seem to be the best solution, but it works. If there are better solutions, I will be happy to read them.
+path = "*DataBricksWorkspace*/temp_table.json"
+dbutils.fs.rm(path, True)
+sparkDF.write.json(path)
+
+# read the saved copy back from the same path
+sparkDF_copy = spark.read.json(path)
+
+sparkDF_copy.write.jdbc(url=jdbcUrl,
+ table=db_table_name,
+ mode="overwrite",
+ properties=connectionProperties)
+
",python
+"Making an efficient combining and fill-up algorithm
+I have a list of suitcases; each suitcase has a name and a weight associated with it. I want to write a function that groups these suitcases so that the combined weight of each group is always a multiple of 8, and returns a list of the formed tuples. If a suitcase cannot be combined into a multiple of 8, it gets "filled up" with 1s (this should only be the last resort). So for example:
+sc1 = suitcase("sc1", 5)
+sc2 = suitcase("sc2", 1)
+sc3 = suitcase("sc3", 3)
+sc4 = suitcase("sc4", 14)
+sc5 = suitcase("sc5", 4)
+sc6 = suitcase("sc6", 1)
+sc7 = suitcase("sc7", 8)
+sclist = [sc1,sc2,sc3,sc4,sc5,sc6,sc7]
+
+sorted_tuple = sort_suitcases(sclist)
+
+sorted_tuple = [(sc7),(sc1,sc3),(sc4,sc2,sc6),(sc5,{1,1,1,1})] # this is obviously only one of many possible combinations.
+# having only one big tuple would obviously also be a solution
+
+My approach would be to loop over each value, then loop over every other value left in the list and check whether their combined weight is a multiple of 8, but I feel this approach would not be very efficient with big data sets. Am I missing something?
","
+- Partition the suitcases into 8 groups, based on their weight % 8.
+- Pair up members of the groups to make sums that are multiples of 8: 1 & 7, 2 & 6, 3 & 5, 4 & itself.
+- Deal with those that couldn't be paired off (larger groups & filling-up)
+
",python
+"Easiest way to parse command line string to subprocess list?
+I'm trying to figure out how to run this command using subprocess.run():
+cmd = 'find / \( -path /mnt -prune -o -path /dev -prune -o -path /proc -prune -o -path /sys -prune \) -o ! -type l -type f -or -type d -printf "depth="%d/"perm="%m/"size="%s/"atime="%A@/"mtime"=%T@/"ctime"=%C@/"hardlinks"=%n/"selinux_context"=%Z/"user="%u/"group="%g/"name="%p/"type="%Y\\n'
+
+I've put the command into a list, even removing items, etc:
+cmd = [
+ 'find',
+ '/',
+ '\( -path /mnt -prune -o -path /dev -prune -o -path /proc -prune -o -path /sys -prune \)',
+ '-o',
+ '! -type l',
+ '-type f',
+ '-or',
+ '-type d'
+]
+
+
+I've tried running the command using /bin/bash:
+cmd = '/bin/bash -c find / \( -path /mnt -prune -o -path /dev -prune -o -path /proc -prune -o -path /sys -prune \) -o ! -type l -type f -or -type d -printf "depth="%d/"perm="%m/"size="%s/"atime="%A@/"mtime"=%T@/"ctime"=%C@/"hardlinks"=%n/"selinux_context"=%Z/"user="%u/"group="%g/"name="%p/"type="%Y\\n'
+
+Doesn't matter. Everything I've tried does not work. Either I get no output at all, or it lists the files in my home directory, or I get an error, e.g.: b'find: paths must precede expression: ! -type l\nUsage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]\n'
+Is there any easy way to take a command that works at the command line and just parse the string into whatever list elements subprocess.run() wants?
","Parsing With shlex.split()
+After fixing the incorrect quotes in your printf string, we get:
+import shlex
+
+cmd = r'''
+find / \( -path /mnt -prune -o -path /dev -prune -o -path /proc -prune -o -path /sys -prune \) -o ! -type l -type f -or -type d -printf 'depth=%d/perm=%m/size=%s/atime=%A@/mtime=%T@/ctime=%C@/hardlinks=%n/selinux_context=%Z/user=%u/group=%g/name=%p/type=%Y\\n'
+'''
+print(shlex.split(cmd))
+
+...which emits an entirely correct result, and subprocess.call() works with it properly.
+
+Building A Correct Command Line By Hand
+In terms of what it looks like to do this by hand:
+cmd = [
+ 'find', '/',
+ '(',
+ '-path', '/mnt', '-prune',
+ '-o', '-path', '/dev', '-prune',
+ '-o', '-path', '/proc', '-prune',
+ '-o', '-path', '/sys', '-prune',
+ ')',
+ '-o', '!', '-type', 'l',
+ '-type', 'f',
+ '-or',
+ '-type', 'd',
+ '-printf', 'depth=%d/perm=%m/size=%s/atime=%A@/mtime=%T@/ctime=%C@/hardlinks=%n/selinux_context=%Z/user=%u/group=%g/name=%p/type=%Y\n'
+]
+
+Note:
+
+- Syntactic quotes change the shell's parsing mode, they don't become part of the data. "foo" just becomes foo; "foo"bar"baz" becomes foobarbaz. So you can't/shouldn't/don't try to put those quotes into the data that Python is passing in.
+- This is true also for \(: the backslash is shell syntax. It doesn't actually become one of find's arguments, so you leave it out.
+- Any space that isn't quoted or escaped separates words; so -type f in shell is '-type', 'f', two separate words.
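+The three notes above can be checked directly with shlex.split(). The sketch below uses a shortened version of the question's command (an assumption made to keep the output readable) and only parses it; nothing is executed.
+
```python
import shlex

# Shortened version of the command from the question, with its original
# shell quoting and backslashes left intact.
command = r'find / \( -path /mnt -prune \) -o -type f -printf "depth="%d/"perm="%m'

args = shlex.split(command)
print(args)
# ['find', '/', '(', '-path', '/mnt', '-prune', ')', '-o', '-type', 'f',
#  '-printf', 'depth=%d/perm=%m']
```
+
+The quotes and backslashes are gone from the result, and -type f has become two separate list items, which is the form subprocess.run() expects when shell=False.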
+
",python
+"How to keep committing the code to github even after pytest-check hook fails?
+Below is my .pre-commit-config.yaml file for my project.
+# See https://pre-commit.com/hooks.html for more hooks
+fail_fast: true
+repos:
+- repo: https://github.com/pre-commit/pre-commit-hooks
+ rev: v4.2.0
+ hooks:
+ - id: trailing-whitespace
+ - id: end-of-file-fixer
+ - id: check-yaml
+ - id: check-added-large-files
+
+- repo: local
+ hooks:
+ - id: isort
+ name: isort
+ entry: isort
+ language: python
+ types: [python]
+- repo: local
+ hooks:
+ - id: black
+ name: black
+ entry: black
+ language: python
+ types: [python]
+
+- repo: local
+ hooks:
+ - id: pytest-check
+ name: pytest-check
+ #entry: pytest tests/test_file2.py
+ entry: pytest
+ language: python
+ #args: [--maxfail=1]
+ #stages: [post-commit]
+ pass_filenames: false
+ always_run: false
+
+I would like to achieve two things, as described below.
+
+- Sometimes developers modify both the business logic and the test cases, and the tests fail for some reason. If the tests keep failing, there is a risk of losing code, because pre-commit won't allow the changes to be committed until all checks pass. Hence, we always want to be able to commit to feature branches even when test cases fail.
+Note: I would like to achieve this only for the pytest-check hook, not for the other hooks.
+- With my current .pre-commit-config.yaml, it executes the complete test suite, but I want to execute it only for a specific test file.
+
+FYI: I have already explored one approach to bypass pre-commit, but it applies to all hooks listed in the .pre-commit-config.yaml file.
+How can we achieve both scenarios? Please share your suggestions.
","Make your test into a shell script that executes, in effect,
+pytest tests/test_file2.py (or whatever you are doing now)
+true
+
+(The last command could also be exit 0, or anything else that exits successfully.) That way the tests will run, with the side effects and output that you have now, but git will always treat them as successful.
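+The same "run it, then report success anyway" idea can also be written as a small Python wrapper instead of a shell script. This is a sketch under assumptions: the function name run_and_always_succeed is made up, and the demo runs a throwaway Python command that exits with status 3 so the example works without pytest installed.
+
```python
import subprocess
import sys


def run_and_always_succeed(command):
    """Run command, report its real exit status, but return 0 so that
    a calling hook never blocks on it."""
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"command exited with {result.returncode} (ignored)", file=sys.stderr)
    return 0


# Demo with a command that fails on purpose.
exit_code = run_and_always_succeed([sys.executable, "-c", "import sys; sys.exit(3)"])
print(exit_code)  # 0
```
+
+In a hook script, the command would be something like ["pytest", "tests/test_file2.py"], and the script would finish with sys.exit() on the returned value.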
",python
+"Seeking missing element in an array in Python
+I have an array I with shape=(9,2). I want to probe this array for missing j=1. By missing, I mean there are no indices with j=1. As is evident from I, there are indices with j=2,3,4,5,6,7.
+For the purpose of notation, in [0,3], i=0,j=3.
+import numpy as np
+
+I=np.array([[0, 3],
+ [1, 2],
+ [1, 4],
+ [2, 5],
+ [3, 4],
+ [4, 5],
+ [4, 6],
+ [5, 7],
+ [6, 7]])
+
+The expected output is
+Missing_j=[1]
+
","If you want to check a single index, you can use the code from @ArrowRise's comment.
+if 1 not in I[:,1]: print('1 is missing')
+
+
+If you want to get all missing indexes then you can use set() for this.
+You can convert second column to set
+set1 = set(I[:,1])
+
+and generate set with all expected indexes
+max_j = max(I[:,1])
+
+set2 = set( range(1, max_j+1) )
+
+And later you can do
+missing_j = set2 - set1
+
+
+Full working example - I added [6, 10] to have more missing indexes.
+import numpy as np
+
+I = np.array([
+ [0, 3],
+ [1, 2],
+ [1, 4],
+ [2, 5],
+ [3, 4],
+ [4, 5],
+ [4, 6],
+ [5, 7],
+ [6, 7],
+ [6, 10],
+])
+
+set1 = set( I[:,1] )
+
+max_j = max(I[:,1])
+set2 = set( range(1, max_j+1) )
+
+missing_j = sorted( set2 - set1 )
+
+print( missing_j )
+
+Result:
+[1, 8, 9]
+
",python
+"only list-like objects are allowed to be passed to isin(), you passed a [str]
+I am trying to create a dashboard using Plotly in Python. I need to create a dropdown for date selection for the pie chart. All the data comes from a .csv file.
+Expectation: The data displayed in the pie chart is based on the selected date.
+Code:
+date_category = list(df['Date'].unique())
+
+app.layout = ...,
+
+ dcc.Dropdown(id='date_drdn', multi=False, value= ['02/01/2022'],
+ options = [{'label':x, 'value':x}
+ for x in date_category]
+ ),
+
+ dcc.Graph(id='pie-fig', figure={})
+
+@app.callback(
+ Output('pie-fig', 'figure'),
+ Input('date_drdn', 'value'))
+
+
+def update_graph(selection):
+
+ dff = df[df['Date'].isin(selection)]
+ fig = px.pie(dff, values='Transactions', names='Product', color_discrete_sequence=px.colors.sequential.RdBu)
+ fig.update_traces(textinfo= "label+value+percent").update_layout(title_x=0.5)
+ return fig
+
+However, it keeps showing an error message when I select a date.
+
Error message:"only list-like objects are allowed to be passed to isin(), you passed a [str]"
+And the data is not displayed based on the selected date.
+Does anyone know why and how to solve it?
","A dropdown that allows multiple selections (multi=True) returns its value as a list, which is what isin() expects. Here the pie chart is driven by a single date, so multi=False is correct, but then the initial value must be a plain string rather than a list, and the callback receives a single date string. Filter with == instead of isin():
+date_category = list(df['Date'].unique())
+
+from dash import Dash, dcc, html, Input, Output
+import plotly.express as px
+#from jupyter_dash import JupyterDash
+
+app = Dash(__name__)
+#app = JupyterDash(__name__)
+
+app.layout = html.Div([
+ html.H3('Daily Graph'),
+ dcc.Dropdown(id='date_drdn',
+ multi=False,
+ value= '02/01/2022',
+ options = [{'label':x, 'value':x} for x in date_category]
+ ),
+
+ dcc.Graph(id='pie-fig', figure={})
+])
+
+@app.callback(
+ Output('pie-fig', 'figure'),
+ Input('date_drdn', 'value'))
+def update_graph(selection):
+ # if selection:
+ dff = df[df['Date'] == selection]
+ #print(dff)
+ fig = px.pie(dff, values='Transactions', names='Product', color_discrete_sequence=px.colors.sequential.RdBu)
+ fig.update_traces(textinfo="label+value+percent").update_layout(title_x=0.5)
+ return fig
+
+if __name__ == '__main__':
+ app.run_server(debug=True)#, mode='inline'
+
",python
+"Search by value of dict in array
+I have a large list of dicts, each dict has a token.
+large_list = [{"token": "4kj13", "value1": 10, "value2": 20},
+ {"token": "hm9gm", "value1": 15, "value2": 30}]
+
+I need to quickly find a dictionary by token, something like
+print(large_list["4kj13"]["value1"])
+
+Is there any elegant way to do it? I think I can create a dictionary token to index:
+token2index = {"4kj13": 0, "hm9gm": 1}
+
+But if there's a better solution, then I would be glad to know.
+I can't change the input format (json), though I can create some intermediate data.
+UPD: also the content of the dict is not simple, so the list can't be easily transformed to a table
+UPD2: tokens are unique
","Convert list of dict to dict
+d = {x["token"]: x for x in large_list}
+d["4kj13"]["value1"]
+# 10
+
",python
+"FastAPI response model list of json objects
+I am using MongoDB and FastAPI, but I can't get my response for more than one document to render without an error. It's a lack of understanding on my part, but no matter what I read, I can't seem to get to the bottom of it.
+models.py
+from pydantic import BaseModel, constr, Field
+
+ #Class for a user
+ class User(BaseModel):
+ username: constr(to_lower=True)
+ _id: str = Field(..., alias='id')
+ name: str
+ isActive : bool
+ weekPlan : str
+
+ #Example to provide on FastAPI Docs
+ class Config:
+
+ allow_population_by_field_name = True
+ orm_mode = True
+ schema_extra = {
+
+ "example": {
+ "name": "John Smith",
+ "username": "john@smith.com",
+ "isActive": "true",
+ "weekPlan": "1234567",
+ }
+ }
+
+routes.py
+from fastapi import APIRouter, HTTPException, status, Response
+
+from models.user import User
+from config.db import dbusers
+
+user = APIRouter()
+
+@user.get('/users', tags=["users"], response_model=list[User])
+ async def find_all_users(response: Response):
+ # Content-Range needed for react-admin
+ response.headers['Content-Range'] = '4'
+ response.headers['Access-Control-Expose-Headers'] = 'content-range'
+ users = (dbusers.find())
+ return users
+
+mongodb json data
+{
+ "_id" : ObjectId("62b325f65402e5ceea8a4b6f")
+ },
+ "name": "John Smith",
+ "isActive": true,
+ "weekPlan": "1234567"
+ },
+ {
+ "_id" : ObjectId("62b325f65402e5ceea9a3d4c"),
+ "username" : "john@smith.com",
+ "name" : "John Smith",
+ "isActive" : true,
+ "weekPlan" : "1234567"
+ }
+
+This is the error I get:
+ await self.app(scope, receive, send)
+ File "C:\Git2\thrive-app-react\backend\venv\lib\site-packages\starlette\routing.py", line 670, in __call__
+ await route.handle(scope, receive, send)
+ File "C:\Git2\thrive-app-react\backend\venv\lib\site-packages\starlette\routing.py", line 266, in handle
+ await self.app(scope, receive, send)
+ File "C:\Git2\thrive-app-react\backend\venv\lib\site-packages\starlette\routing.py", line 65, in app
+ response = await func(request)
+ File "C:\Git2\thrive-app-react\backend\venv\lib\site-packages\fastapi\routing.py", line 235, in app
+ response_data = await serialize_response(
+ File "C:\Git2\thrive-app-react\backend\venv\lib\site-packages\fastapi\routing.py", line 138, in serialize_response
+ raise ValidationError(errors, field.type_)
+pydantic.error_wrappers.ValidationError: 1 validation error for User
+response
+ value is not a valid list (type=type_error.list)
+
+Can anyone help?
","pymongo's find method returns a Cursor - you have to exhaust this iterator first, since Pydantic doesn't have any idea what it should do with a Cursor object.
+You can do this by giving it as an argument to list:
+@user.get('/users', tags=["users"], response_model=List[User])
+async def find_all_users(response: Response):
+ ...
+ return list(dbusers.find())
+
",python
+"Python: Behaviour of float memory addressing
+All,
+When I execute the following code from inside the Spyder IDE editor, I get the same ID for a, b, and the number 1000., yet when I execute the code from the Spyder console, I get different IDs (the ID of a is different from b). Floats are known to be immutable, yet they behave as if they were mutable when executed from the Spyder editor. Any idea why this is the case?
+a=1000.
+b=1000.
+print('id of a='+str(id(a)))
+print('id of b='+str(id(b)))
+print('id of 1.'+str(id(1000.)))
+
+Thanks
","This has nothing to do with Spyder, this is purely a Python optimisation, as kindly explained by @Barmar.
+Example 1: Line-based execution
+When the values are assigned for line-execute (even if the lines are grouped when executed), the following output is provided:
+>>> a = 1000.
+>>> b = 1000.
+>>> print(f'{id(a)=}\n{id(b)=}')
+
+id(a)=140093502852336
+id(b)=140093502851184
+
+Different IDs.
+Example 2: Function-based execution
+However, when the assignments are wrapped in a function (as shown below), Python performs optimisations which basically say: "Hey, I've got a variable for that value already, so I'll just use it again."
+def test():
+ a = 1000.
+ b = 1000.
+ print(f'{id(a)=}\n{id(b)=}')
+
+>>> test()
+id(a)=140093502851472
+id(b)=140093502851472
+
+Same IDs.
+
+Bytecode: Function-based execution
+As shown in the bytecode for the test() function, both assignments load the same constant (index 1 in the function's constants), so a and b refer to the same float object and therefore have the same ID.
+  2           0 LOAD_CONST               1 (1000.0)   # <-- Here (1)
+              2 STORE_FAST               0 (a)
+
+  3           4 LOAD_CONST               1 (1000.0)   # <-- Here (1)
+              6 STORE_FAST               1 (b)
+
+  4           8 LOAD_GLOBAL              0 (print)
+             10 LOAD_CONST               2 ('id(a)=')
+             12 LOAD_GLOBAL              1 (id)
+             14 LOAD_FAST                0 (a)
+             16 CALL_FUNCTION            1
+             18 FORMAT_VALUE             2 (repr)
+             20 LOAD_CONST               3 ('\nid(b)=')
+             22 LOAD_GLOBAL              1 (id)
+             24 LOAD_FAST                1 (b)
+             26 CALL_FUNCTION            1
+             28 FORMAT_VALUE             2 (repr)
+             30 BUILD_STRING             4
+             32 CALL_FUNCTION            1
+             34 POP_TOP
+             36 LOAD_CONST               0 (None)
+             38 RETURN_VALUE
+
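+A minimal sketch to check this yourself, assuming CPython (this constant sharing is an implementation detail, not a language guarantee). The function name same_object is made up for illustration:
+
```python
def same_object():
    a = 1000.
    b = 1000.
    # Both names are loaded from the same entry of the function's constants,
    # so they end up bound to one and the same float object.
    return a is b


# The literal 1000.0 is stored only once in the compiled function.
print(same_object.__code__.co_consts)
print(same_object())
```
+
+At the interactive prompt each statement is compiled on its own, which is why the two assignments there can end up with different float objects.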
",python
+"Should libraries be synchronized with a repository (eg GitHub)PyCharm offers to synchronize imported packages (eg openpyxl) with repositories.
+Is it good practice to synch these (even though they are imported standard packages)?
+Thanks
","The answer is no.
+A virtual environment need not be replicated, as the packages and their versions can be listed with the pip freeze command into a text file called requirements.txt, which should be shared instead.
+Others can use this file to install the same libraries (pip install -r requirements.txt).
",python
+"Pandas apply().to_excel() got DataFrame is not callable empty_stock_list = [
+ {
+ 'row_index': <num>,
+ 'column_index': <num>
+ },
+ ...
+ ]
+
+ with pd.ExcelWriter(OUTPUT_FILE, engine='xlsxwriter') as writer:
+ df = pd.concat([header_row, data_price], ignore_index=False, sort=False).reset_index(drop=True)
+
+ df_color = df.copy()
+ df_color.iloc[:,:] = 'font-color: black'
+ for empty_stock in empty_stock_list:
+ df_color.iloc[empty_stock['row_index'], empty_stock['column_index']] = 'font-color: #FF0000'
+
+ df.style.apply(df_color, axis=None).\
+ to_excel(writer, sheet_name=sheet_name, index=False, header=None)
+
+I have this code above, but always get this error: TypeError: 'DataFrame' object is not callable. Basically what I'm trying to do is to make the cell color into a red color if a stock is empty (Based on the data row_index and column_index).
+Tried to follow the documentation, but I can't seem to make this right.
+Below is the traceback error messages:
+Traceback (most recent call last):
+ File "main.py", line 129, in <module>
+ df.style.apply(df_color, axis=None).\
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/style.py", line 229, in to_excel
+ formatter.write(
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 734, in write
+ writer.write_cells(
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/excel/_xlsxwriter.py", line 212, in write_cells
+ for cell in cells:
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 688, in get_formatted_cells
+ for cell in itertools.chain(self._format_header(), self._format_body()):
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 590, in _format_regular_rows
+ for cell in self._generate_body(coloffset):
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/excel.py", line 674, in _generate_body
+ styles = self.styler._compute().ctx
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/style.py", line 625, in _compute
+ r = func(self)(*args, **kwargs)
+ File "/home/michaelharley/.local/lib/python3.8/site-packages/pandas/io/formats/style.py", line 642, in _apply
+ result = func(data, **kwargs)
+TypeError: 'DataFrame' object is not callable
+
+I'm using these dependencies:
+
+- python 3.8.0
+- pandas 1.1.3
+- xlrd 1.2.0
+- XlsxWriter 1.3.7
+
","I think you need to create a function, pass it to Styler.apply (which expects a callable, not a DataFrame), and change font-color to color:
+def func(df):
+    # Start with every cell styled black
+    df_color = pd.DataFrame('color: black', index=df.index, columns=df.columns)
+
+    # Mark the cells of empty stocks in red
+    for empty_stock in empty_stock_list:
+        i = empty_stock['row_index']
+        j = empty_stock['column_index']
+        df_color.iloc[i,j] = 'color: #FF0000'
+    return df_color
+
+
+with pd.ExcelWriter(OUTPUT_FILE, engine='xlsxwriter') as writer:
+    df = pd.concat([header_row, data_price],
+                   ignore_index=False,
+                   sort=False).reset_index(drop=True)
+
+    (df.style.apply(func, axis=None)
+       .to_excel(writer, sheet_name=sheet_name, index=False, header=None))
+
",python
+"How to increase AWS Sagemaker invocation time out while waiting for a responseI deployed a large 3D model to aws sagemaker. Inference will take 2 minutes or more. I get the following error while calling the predictor from Python:
+An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."'
+
+In Cloud Watch I also see some PING time outs while the container is processing:
+2020-10-07T16:02:39.718+02:00 2020/10/07 14:02:39 https://forums.aws.amazon.com/ 106#106: *251 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.32.0.2, server: , request: "GET /ping HTTP/1.1", upstream: "http://unix:/tmp/gunicorn.sock/ping", host: "model.aws.local:8080"
+
+How do I increase the invocation time out?
+Or is there a way to make async invocations to an sagemaker endpoint?
","At the time of writing, it is not possible to increase the timeout; this is an open issue on GitHub. Looking through the issue and similar questions on SO, it seems like you may be able to use batch transforms in conjunction with inference.
+References
+https://stackoverflow.com/a/55642675/806876
+Sagemaker Python SDK timeout issue: https://github.com/aws/sagemaker-python-sdk/issues/1119
",python
+"For loop within a Pandas dataframe to add a new column after each iterationI have a dataframe that has a variety of properties on a dataset of buildings. These buildings are all assigned to a dwelling group (Apartment/ Semi detached house/ Detached house/ Terraced house) and a small area code. These buildings also have a year of construction column, however no unique identifier apart from their small area (circa 80 buildings).
+I want to write a for loop that groups these buildings into their dwelling group, and then break them down into their small area and assigns them individually the median year of construction for that dwelling group in that small area. For example, divide up all apartments in small area 12345, and assign them individually (in a new column) the median year of construction for apartments in that small area.
+So far geo_dwelling is a GeoDataFrame with columns;
+In [20]: geo_dwelling.head(5)
+
+Out[20]:
+cso_small_area Dublin Postcode Year of construction Year of construction range Dwelling type description Energy Rating ... height_ag height_bg floors_ag floors_bg category Dwelling Group
+7101 268109005 DUBLIN 1 2009.0 2005 onwards Mid floor apt. B3 ... 10.02 0 3 0 R Apartment
+7101 268109005 DUBLIN 1 2009.0 2005 onwards Mid floor apt. B3 ... 10.73 0 3 0 R Apartment
+7101 268109005 DUBLIN 1 2009.0 2005 onwards Mid floor apt. B3 ... 10.56 0 3 0 R Apartment
+7101 268109005 DUBLIN 1 2009.0 2005 onwards Mid floor apt. B3 ... 10.75 0 3 0 R Apartment
+7101 268109005 DUBLIN 1 2009.0 2005 onwards Mid floor apt. B3 ... 10.85 0 3 0 R Apartment
+geo_dwelling = geo_dropped[
+geo_dropped["Dwelling Group"].str.contains("Apartment", na=False)]
+
+geo_dwelling.groupby(["cso_small_area"])[["Year of construction"]].median()
+
+Any help is much appreciated!
","It's generally considered bad practice to write 'for' loops over a pandas dataframe (it takes a lot of time, too!). I believe the answer to your question lies in this article:
+How to iterate over rows in a DataFrame in Pandas
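+For the grouping described in the question, one loop-free option is groupby(...).transform('median'), which returns one value per original row. A minimal sketch, assuming the column names from the question; the toy data and the new column name median_year are made up for illustration:
+
```python
import pandas as pd

# Toy stand-in for geo_dwelling; the values are invented for illustration.
df = pd.DataFrame({
    'cso_small_area': [1, 1, 1, 2, 2],
    'Dwelling Group': ['Apartment', 'Apartment', 'Detached house',
                       'Apartment', 'Apartment'],
    'Year of construction': [2000.0, 2010.0, 1950.0, 1980.0, 1990.0],
})

# transform() broadcasts each group's median back onto the rows of that group,
# so the result lines up with the original index and can be assigned directly.
df['median_year'] = (
    df.groupby(['Dwelling Group', 'cso_small_area'])['Year of construction']
      .transform('median')
)
print(df)
```
+
+Because the result keeps the original index, no explicit loop over dwelling groups or small areas is needed.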
",python
+"Getting the correct order of output with append in for loopI have a df in which I want to place column 1 under column 0 and column 2 under column 1 and so on for n number of columns so that I have one series.
+Input:
+df = pd.DataFrame([['Adf', 'Bdf', 'Cdf','Ddf','Edf','Fdf','Gdf'],[45.1, 34.3, 23.1,67.1,45.4,78.2,85.1]] )
+tmp = pd.Series([], dtype=object)
+for i in range(df.shape[1]-1):
+    tmp=tmp.append(df.iloc[:][i].append(df.iloc[:][i+1]))
+
+Output:df
+ 0 1 2 3 4 5 6
+0 Adf Bdf Cdf Ddf Edf Fdf Gdf
+1 45.1 34.3 23.1 67.1 45.4 78.2 85.1
+
+Output: tmp: Almost correct but there are double entries
+0 Adf
+1 45.1
+0 Bdf
+1 34.3
+0 Bdf
+1 34.3
+0 Cdf
+1 23.1
+0 Cdf
+1 23.1
+0 Ddf
+1 67.1
+0 Ddf
+1 67.1
+0 Edf
+1 45.4
+0 Edf
+1 45.4
+0 Fdf
+1 78.2
+0 Fdf
+1 78.2
+0 Gdf
+1 85.1
+
+Desired Output:Created manually
+Adf
+45.1
+Bdf
+34.3
+Cdf
+23.1
+Ddf
+67.1
+Edf
+45.4
+Fdf
+78.2
+Gdf
+85.1
+
+I welcome any better approaches. Thanks
","Use DataFrame.unstack:
+s = df.unstack().reset_index(drop=True)
+print (s)
+0 Adf
+1 45.1
+2 Bdf
+3 34.3
+4 Cdf
+5 23.1
+6 Ddf
+7 67.1
+8 Edf
+9 45.4
+10 Fdf
+11 78.2
+12 Gdf
+13 85.1
+dtype: object
+
+Or convert all values to numpy array and then use np.ravel:
+s = pd.Series(np.ravel(df.to_numpy().T))
+print (s)
+0 Adf
+1 45.1
+2 Bdf
+3 34.3
+4 Cdf
+5 23.1
+6 Ddf
+7 67.1
+8 Edf
+9 45.4
+10 Fdf
+11 78.2
+12 Gdf
+13 85.1
+dtype: object
+
",python
+"Problem with counting the number of occurrences of an item of a listFor some reason, I'm not able to count the number of occurrences of an item in a list. The first three functions are needed to generate some data and I'm looking at the function distribution() where I have the problem. The list I'm looking at is finalstate i.e. I am not getting the right value for variable c. For example, as below, I'm counting number of occurrences of [0,0,0] in finalstate and it is supposed to be 1 and I'm getting 0. May I know where I went wrong?
+Output:
+ finalstate: [[1, 0, 0], [1, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
+ key: [0, 0, 0]
+ c: 0
+
+
+Code:
+ def generateAllBinaryStrings(n, arr, l, i):
+
+ if i == n:
+ l.append(arr[:])
+ return
+
+ arr[i] = 0
+ generateAllBinaryStrings(n, arr, l, i + 1)
+
+ arr[i] = 1
+ generateAllBinaryStrings(n, arr, l, i + 1)
+
+ return l
+
+
+ def dictionary(v):
+ d={}
+ for i in range(len(v)):
+ d[str(v[i])]=[]
+ temp=[]
+ for j in range(n):
+ temp=v[i][:]
+ if v[i][j]==1:
+ temp[j]=0
+ else:
+ temp[j]=1
+ d[str(v[i])].append(temp)
+ return d
+
+
+ def srw(d,n,t):
+ h=[[0 for i in range(n)]]
+ w=[1/(2*n) for i in range(n)]
+ w.append(0.5)
+ for i in range(t):
+ temp=d[str(h[-1])][:]
+ temp.append(h[-1])
+ h.append(random.choices(temp,weights=w)[-1])
+
+ return h
+
+
+ def distribution(d,n,t,num):
+ finalstate=[]
+ for i in range(num):
+ temp=srw(d,n,t)
+ finalstate.append(temp[-1])
+ print(finalstate)
+ Xt={}
+ for key in d:
+ c=finalstate.count(list(key))
+ print(c)
+ Xt[str(key)]=c/num
+ for key in d:
+ if key not in Xt:
+ Xt[key]=0
+
+ return Xt
+
+ import numpy as np
+ import matplotlib.pyplot as plt
+ import random
+ import collections as cs
+
+ time=40
+ numsim=10
+
+ #for n in range(5,11,5):
+ n=3
+ l = []
+ arr = [None] * n
+ vertices=generateAllBinaryStrings(n, arr, l, 0)
+ d=dictionary(vertices)
+ dist=distribution(d,n,time,numsim)
+
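","The key is a string (for example '[0, 0, 0]'), while finalstate contains lists; list(key) only splits that string into single characters, so count() never finds a match.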
+Try this code:
+c=finalstate.count(eval(key)) # convert string to list
+
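+If you would rather not call eval on strings, ast.literal_eval from the standard library should do the same conversion here while only accepting Python literals. A small sketch with made-up data:
+
```python
import ast

finalstate = [[1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
key = str([0, 0, 0])  # '[0, 0, 0]', the kind of key built by dictionary()

# list(key) yields single characters such as '[' and '0', never the list itself.
wrong = finalstate.count(list(key))

# literal_eval parses the string back into the list [0, 0, 0].
right = finalstate.count(ast.literal_eval(key))

print(wrong, right)
```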
",python
+"Count values in Pandas data Frame -PythonI have a data set as such
+![]()
+For simplicity -Let's say I want to calculate the number of type of each manufacturer of the plane.
+![]()
+I want the output as such-
+BOEING-xxx
+EMBRAER-xxx
+MCDONNELL-XXX
+:
+:
+:
+so on
+
+How can I do this ? Please help me out with this.
","You can use dataframe['manufacturer'].value_counts() to get the result that you want;
+However, note that you have NaNs in your column. value_counts() already ignores them by default (dropna=True); if you also want to drop those rows from the dataframe itself, use:
+dataframe.dropna(subset=['manufacturer'],inplace=True)
+
+Summing it up:
+
+dataframe.dropna(subset=['manufacturer'],inplace=True)
+dataframe['manufacturer'].value_counts()
+
",python
+"Create columns in dataframe from list (Number of columns change)I'm working with a pandas dataframe and I have a problem.
+My input is a list and I don't know how many elements there are in the list, it could be anything from 1 to 5 or 6. I need to add new columns to the dataframe, one for each element in the list.
+Currently, I add comments to lines, but it doesn't work automatically. My code:
+list = ['banana', 'apple', 'kiwi'] (3 elements, so i comment 2 lines)
+
+df.loc[:, list[0]] = np.where(df['food'] == list[0], 1.0, 0.0)
+df.loc[:, list[1]] = np.where(df['food'] == list[1], 1.0, 0.0)
+df.loc[:, list[2]] = np.where(df['food'] == list[2], 1.0, 0.0)
+#df.loc[:, list[3]] = np.where(df['food'] == list[3], 1.0, 0.0)
+#df.loc[:, list[4]] = np.where(df['food'] == list[4], 1.0, 0.0)
+
+I would like to have something that reads the number of elements in the list, and then creates the correct number of columns automatically, without # adding comments.
","I think in pandas it is best to avoid loops, so use get_dummies on the rows filtered by Series.isin:
+L = ['banana', 'apple', 'kiwi']
+df1 = pd.get_dummies(df.loc[df['food'].isin(L), 'food'])
+
+Finally, use DataFrame.reindex to add all-zero rows for the entries that were filtered out, and DataFrame.join to attach the result to the original:
+df = df.join(df1.reindex(df.index, fill_value=0.0))
+
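+If you prefer to keep the np.where approach from the question, a plain loop over the list also creates one column per element, however many there are. A minimal sketch with a made-up dataframe; the list is named foods here because naming it list shadows the built-in:
+
```python
import numpy as np
import pandas as pd

# Made-up data; 'pear' is deliberately not in the list of wanted columns.
df = pd.DataFrame({'food': ['banana', 'kiwi', 'pear', 'apple']})
foods = ['banana', 'apple', 'kiwi']

# One indicator column per list element, regardless of the list length.
for name in foods:
    df[name] = np.where(df['food'] == name, 1.0, 0.0)

print(df)
```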
",python
+"Python: Capture the output of a REST API based DMS service post uploading documentI am trying to upload a file to a DMS via REST API. Each time I upload the file to DMS, an unique doc_id is generated which needs to be saved in DB.
+I am trying out the following code for the first part i.e. upload.
+def upload_sotr(filepath:str,file_name:str):
+ upload_url = 'dms_url_path'
+ f = open(os.path.join(filepath,file_name),'rb')
+ files = {"file":(os.path.join(filepath,file_name),f)}
+ resp = requests.post(url=url,files=files)
+ if resp.status_code==201:
+ print('Success!!')
+ ##Want to get the doc_id as shown below and return the same
+ return 'Success!!'
+else:
+ strg='Failure'
+ return strg
+
+However, I am not able to capture the doc_id string from upload_url post uploading the doc. Typically, doc_id is returned as
+{
+ doc_type: 'image',
+ doc_id: 'AAD3456Q77'
+}
+
+As indicated in the code, what trick I should do post print('Success!!') so that I get the doc_id?
","Ok, I found the trick!!
+I should use
+data = resp.json()
+doc_id = data['doc_id']
+return doc_id
+
+So the complete code would be:
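+import os
+import requests
+
+def upload_sotr(filepath:str,file_name:str):
+    upload_url = 'dms_url_path'
+    full_path = os.path.join(filepath,file_name)
+    # Open the file in a with block so it is closed once the upload finishes
+    with open(full_path,'rb') as f:
+        files = {"file":(full_path,f)}
+        # Post the file to the upload endpoint
+        resp = requests.post(url=upload_url,files=files)
+    if resp.status_code==201:
+        # The response body is JSON; doc_id is one of its keys
+        data = resp.json()
+        doc_id = data['doc_id']
+        return doc_id
+    else:
+        strg='Failure'
+        return strg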
+
",python
+"Pandas Python highest 2 rows of every 3 and tabling the resultsSuppose I have the following dataframe:
+ . Column1 Column2
+ 0 25 1
+ 1 89 2
+ 2 59 3
+
+ 3 78 10
+ 4 99 20
+ 5 38 30
+
+ 6 89 100
+ 7 57 200
+ 8 87 300
+
+
+I'm not sure whether what I want to do is possible. I want to compare every three rows of Column1, take the highest 2 of the three, and assign the corresponding 2 Column2 values to a new column. It does not matter whether the values in Column3 are joined or how they are arranged, because I know every 2 rows of Column3 belong to every 3 rows of Column1.
+ . Column1 Column2 Column3
+ 0 25 1 2
+ 1 89 2 3
+ 2 59 3
+ 3 78 10 20
+ 4 99 20 10
+ 5 38 30
+ 6 89 100 100
+ 7 57 200 300
+ 8 87 300
+
+
","You can use np.arange with np.repeat to create a grouping array that groups every 3 rows.
+Then use GroupBy.nlargest, extract the original row indices of the selected values with pd.Index.get_level_values, and assign the matching Column2 values to Column3; pandas handles the index alignment.
+n_grps = len(df)/3
+g = np.repeat(np.arange(n_grps), 3)
+
+idx = df.groupby(g)['Column1'].nlargest(2).index.get_level_values(1)
+vals = df.loc[idx, 'Column2']
+vals
+# 1 2
+# 2 3
+# 4 20
+# 3 10
+# 6 100
+# 8 300
+# Name: Column2, dtype: int64
+
+df['Column3'] = vals
+df
+ Column1 Column2 Column3
+0 25 1 NaN
+1 89 2 2.0
+2 59 3 3.0
+3 78 10 10.0
+4 99 20 20.0
+5 38 30 NaN
+6 89 100 100.0
+7 57 200 NaN
+8 87 300 300.0
+
+To get the output you showed in the question, sort the values within each group so that NaN comes last, using this additional step:
+df['Column3'] = df.groupby(g)['Column3'].apply(lambda x:x.sort_values()).values
+
+ Column1 Column2 Column3
+0 25 1 2.0
+1 89 2 3.0
+2 59 3 NaN
+3 78 10 10.0
+4 99 20 20.0
+5 38 30 NaN
+6 89 100 100.0
+7 57 200 300.0
+8 87 300 NaN
+
",python
+"How to store result calculated inside two for loops in np array?I want to iterate through an image and save the calculated distance between the respective (x,y) pixel and the point (300,600) in the numpy array dist_arr. At the moment the result for all dist values is saved in one element of the array. How is it possible to fill the array storing one value per element?
+dist_arr = np.empty((width, height))
+for x in range(0, width):
+    for y in range(0, height):
+        pixel = (x, y)
+        dist = math.sqrt((300 - pixel[0])**2 + (600 - pixel[1])**2)
+        dist_arr[pixel[0], pixel[1]] = dist
+
+
","There's nothing wrong with your loops:
+In [26]: width, height = 4,4
+    ...: dist_arr = np.empty((width, height))
+    ...: for x in range(0, width):
+    ...:     for y in range(0, height):
+    ...:         dist = math.sqrt((300 - x)**2 + (600 - y)**2)
+    ...:         dist_arr[x, y] = dist
+    ...:
+In [27]: dist_arr
+Out[27]:
+array([[670.82039325, 669.92611533, 669.03213675, 668.1384587 ],
+ [670.37377634, 669.47890183, 668.58432527, 667.69004785],
+ [669.92835438, 669.03288409, 668.13771036, 667.24283436],
+ [669.48412976, 668.58806451, 667.6922944 , 666.79682063]])
+
+There are faster ways of doing this, but the loops do work.
+Same values with whole-array numpy calculation:
+In [28]: np.sqrt((300-np.arange(4)[:,None])**2 + (600 - np.arange(4))**2)
+Out[28]:
+array([[670.82039325, 669.92611533, 669.03213675, 668.1384587 ],
+ [670.37377634, 669.47890183, 668.58432527, 667.69004785],
+ [669.92835438, 669.03288409, 668.13771036, 667.24283436],
+ [669.48412976, 668.58806451, 667.6922944 , 666.79682063]])
+
",python
+"Parsing XML files using element treeI Would like to parse an XML file in order to get the information as variables for further studies.
+One part of the XML looks like this:
+'''<SchedulingPeriod ID="sprint01" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="competition.xsd">
+ <StartDate>2010-01-01</StartDate>
+ <EndDate>2010-01-28</EndDate>
+ <Skills>
+ <Skill>Nurse</Skill>
+ </Skills>
+ <ShiftTypes>
+ <Shift ID="E">
+ <StartTime>06:30:00</StartTime>
+ <EndTime>14:30:00</EndTime>
+ <Description>Early</Description>
+ <Skills>
+ <Skill>Nurse</Skill>
+ </Skills>
+ </Shift>
+'''
+
+I can't extract all the information from root element 3, which is ShiftTypes. My code looks like this, but unfortunately I cannot extract the shift type ID and the skills:
+import xml.etree.ElementTree as ET
+
+xmlfile = 'sprint01.xml'
+
+tree = ET.parse(xmlfile)
+root = tree.getroot()
+
+for x in root[3]:
+    print('StartTime: ', x.find('StartTime').text)
+    print('EndTime: ', x.find('EndTime').text)
+    print('Description', x.find('Description').text)
+
+Thank you. If you have suggestions on how to store these elements, I would really appreciate it.
+Arthur
","Use the following code to extract ID attribute:
+import xml.etree.ElementTree as ET
+tree = ET.parse('a.xml')
+root = tree.getroot()
+for child in root:
+    if child.tag == 'ShiftTypes':
+        for i in child:
+            print ('Here is the ID: ', i.attrib)
+            for j in i:
+                if j.tag == 'StartTime':
+                    print ('Here is StartTime:', j.text)
+                elif j.tag == 'EndTime':
+                    print ('Here is EndTime:', j.text)
+                elif j.tag == 'Description':
+                    print ('Here is Description:', j.text)
+
+Here is the ID: {'ID': 'E'}
+Here is StartTime: 06:30:00
+Here is EndTime: 14:30:00
+Here is Description: Early
+
+Here is a useful tutorial about parsing XML data:
+https://www.datacamp.com/community/tutorials/python-xml-elementtree
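+As for storing the elements, one option is to collect each Shift into a dictionary, using findall with a path plus get for the ID attribute. A sketch, assuming the XML is closed the way the excerpt suggests; the dictionary keys are hypothetical names:
+
```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, closed so that it parses.
xml_text = '''<SchedulingPeriod ID='sprint01'>
  <StartDate>2010-01-01</StartDate>
  <EndDate>2010-01-28</EndDate>
  <Skills>
    <Skill>Nurse</Skill>
  </Skills>
  <ShiftTypes>
    <Shift ID='E'>
      <StartTime>06:30:00</StartTime>
      <EndTime>14:30:00</EndTime>
      <Description>Early</Description>
      <Skills>
        <Skill>Nurse</Skill>
      </Skills>
    </Shift>
  </ShiftTypes>
</SchedulingPeriod>'''

root = ET.fromstring(xml_text)

shifts = []
# Look the elements up by name instead of by position (root[3]).
for shift in root.findall('ShiftTypes/Shift'):
    shifts.append({
        'id': shift.get('ID'),                      # attribute value
        'start': shift.findtext('StartTime'),       # text of a child element
        'end': shift.findtext('EndTime'),
        'description': shift.findtext('Description'),
        'skills': [s.text for s in shift.findall('Skills/Skill')],
    })

print(shifts)
```
+
+With a file on disk, ET.parse(xmlfile).getroot() would replace ET.fromstring(xml_text).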
",python
+"How to remove more than one symbol from csvI'm trying to replace my old.csv data that looks like this: 6004387,6219127,'12524449',10340
+Into new.csv that should look like this: 6004387|6219127|12524449|10340
+What I get now is "['6004387'| '6219127'| ""'12524449'""| '10340']"
+How can I remove more than one symbol?
+import csv
+import string
+
+input_file = open('old.csv', 'r')
+output_file = open('new.csv', 'w')
+data = csv.reader(input_file)
+writer = csv.writer(output_file)
+specials = ','
+
+for row in data:
+    row = str(row)
+    new_row = str.replace(row,specials,'|')
+    writer.writerow(new_row.split(','))
+
+input_file.close()
+output_file.close()
+
","If you want to remove quote characters from input file, specify quotechar="'" in csv.reader. Also, for | delimiter in output file, specify delimiter='|' in csv.writer:
+import csv
+
+input_file = open('old.csv', 'r')
+output_file = open('new.csv', 'w')
+data = csv.reader(input_file, quotechar="'")
+writer = csv.writer(output_file, delimiter='|')
+
+for row in data:
+    writer.writerow(row)
+
+input_file.close()
+output_file.close()
+
+Creates new.csv:
+6004387|6219127|12524449|10340
+
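+The same idea can be tried without touching the disk by using io.StringIO in place of the two files (an assumption made only so that the sketch is self-contained). With real files, the csv module documentation recommends opening them with newline='', ideally in a with block:
+
```python
import csv
import io

# In-memory stand-ins for old.csv and new.csv.
old_csv = io.StringIO("6004387,6219127,'12524449',10340\n")
new_csv = io.StringIO()

# quotechar tells the reader to treat ' as the quoting character and strip it.
reader = csv.reader(old_csv, quotechar="'")
# delimiter sets the output separator; lineterminator keeps plain \n endings.
writer = csv.writer(new_csv, delimiter='|', lineterminator='\n')

for row in reader:
    writer.writerow(row)

print(new_csv.getvalue())
```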
",python
+"Numpy matrix looking like an array of lists?I'm trying to declare a 16x16 numpy matrix:
+P = np.array([[0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0,1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
+
+Outputs:
+ list([0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.4, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.1, 0.3, 0.3, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.4, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.1, 0.3, 0.3, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.4, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0, 1, 0.3, 0.3, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.0, 0.0, 0.8, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
+ list([0.4, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])],
+ dtype=object)
+
+Why? I want a pure numpy matrix, not an array of lists...Sure it's something boneheaded I'm doing but for life of me can't figure it out...
","Well, lets investigate
+import numpy as np
+
+P = np.array([[0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0,1, 0.3, 0.3, 0.3, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.5, 0.5, 0.0, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.0, 0.0, 0.8, 0.2, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+ [0.4, 0.0, 0.0, 0.6, 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
+
+for i in P:
+    print(i.__len__())
+
+...
+16
+17 #oh?
+16
+16
+16
+
+also note that it gives us a warning:
+VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
+
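+so there's a length problem!
+let's look closer
+[0,1, 0.3, 0.3, ...]
+[0.0, 0.5, 0.5, ...]
+
+The 13th row starts with 0,1 (a comma) instead of 0.1, so it has 17 elements instead of 16. Because the rows have different lengths, numpy cannot build a 2-D float array and falls back to a 1-D array of list objects (dtype=object).
+Fix it with [0.1, 0.3, 0.3, ...] to make it 16 x 16.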
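+To locate such a row in a large literal, you could check the row lengths before calling np.array. A minimal sketch in plain Python; the short rows below are a made-up stand-in for the 16x16 matrix:
+
```python
rows = [
    [0.4, 0.0, 0.0, 0.6],
    [0, 1, 0.3, 0.3, 0.3],   # '0,1' typed instead of '0.1' adds an element
    [0.0, 0.5, 0.5, 0.0],
]

expected = len(rows[0])

# Collect (row index, length) for every row that does not match the first row.
bad = [(i, len(r)) for i, r in enumerate(rows) if len(r) != expected]
print(bad)
```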
",python
+"How to reassemble a pandas dataframe based on a non-ordered indexI have been working on an algorithm that separates rows in a dataframe based on one of the columns to treat them differently. The results should be reassembled afterwards. I need to make sure that the index is preserved and in the same order.
+Initially I thought I could just concatenate the results and then put the index in the same order. However, I cannot find an efficient way to do this. The best I could come up with is this:
+import pandas as pd
+
+# Input data with non-ordered index.
+input_data = pd.DataFrame({
+ 'type': ['a', 'a', 'b', 'c', 'a'],
+ 'value': [1, 2, 3, 4, 5],
+}, index=[0, 10, 40, 30, 20])
+# input_data:
+# type value
+# 0 a 1
+# 10 a 2
+# 40 b 3
+# 30 c 4
+# 20 a 5
+
+# Data separated into two and treated differently.
+data_a = input_data[input_data['type'] == 'a']
+data_a['result'] = data_a['value'].mean()
+data_b = input_data[input_data['type'] != 'a']
+data_b['result'] = data_b['value'].max()
+
+# Assemble output.
+output_data = (
+ pd.DataFrame(index=input_data.index) # correct index order
+ .merge(
+ pd.concat(
+ [data_a['result'], data_b['result']], axis=0
+ ), # wrong index order
+ how='left', left_index=True, right_index=True # 'left' preserves order
+ )
+)
+# output_data:
+# result
+# 0 2.666667
+# 10 2.666667
+# 40 4.000000
+# 30 4.000000
+# 20 2.666667
+
+Is there a more straight-forward way to do this? Or perhaps more efficient?
","Use DataFrame.reindex if index values are unique:
+output_data = pd.concat([data_a['result'], data_b['result']]).reindex(input_data.index)
+
",python
+"How to print a list comprehension with strings and integers and add a symbol only to the strings?I have a mixed list of integers and strings and I need to print '**' after strings only.
+data = [4, 'Fred', 'London', 34, 78, '@#=£', 89, 'Ice cream', 'Hamilton', 12, 'tingle']
+[print(data, end='**') for data in data if isinstance(data, str)]
+
+This is the output:
+Fred**London**@#=£**Ice cream**Hamilton**tingle**
+Desired output:
+4, Fred**, London**, 34, 78, @#=£**, 89, Ice Cream**, Hamilton**, 12, tingle**
","Your if is now filtering the list data. Put it before the for. Also, use str.join to join the various strings with , :
+data = [4, 'Fred', 'London', 34, 78, '@#=£', 89, 'Ice cream', 'Hamilton', 12, 'tingle']
+
+print( ', '.join('{}**'.format(item) if isinstance(item, str) else str(item) for item in data) )
+
+Prints:
+4, Fred**, London**, 34, 78, @#=£**, 89, Ice cream**, Hamilton**, 12, tingle**
+
",python
+"Python Pivot Table multi Sub-totals in columnI would like to be able to show the sub-total column from a multi-index pivot table in different ways for example, I would like to show the sum for a selected row and the max for another, is this possible?
+I managed to get half code correct but I am stuck in replicating the code without offsetting the previous one and I am not able to loop this code over.
+In my example I want to get the max value from Toyota and the sum value from Honda shown in the newly created Total column.
+cars = {'Brand': ['Honda','Toyota', 'Honda','Toyota'],
+ 'Target': ['A','B', 'A','B'],
+ 'Speed': [20, 80, 30 , 10],
+ 'Date' : ['13/02/2019', '18/02/2019', '18/02/2019', '13/02/2019']
+ }
+
+df = pd.DataFrame(cars)
+
+
+table = pd.pivot_table(df, values=['Speed'],
+ index=['Target', 'Brand'],
+ columns=['Date'],
+ fill_value=0, aggfunc=np.sum, dropna=True)
+table
+
+![]()
+the code created (which works only for the last line, as it overwrites the first one):
+table['Total'] = table.loc(axis=0)[:, ['Toyota']].max(axis=1)
+table['Total'] = table.loc(axis=0)[:, ['Honda']].sum(axis=1)
+
+Current output:
+![]()
+Desired Output:
+I would like to be able to see also the max value for Toyota which would be 80.
","Use slicers (pd.IndexSlice) to set the new values; here : means all values of the first level:
+idx = pd.IndexSlice
+table.loc[idx[:, 'Toyota'], 'Total'] = table.max(axis=1)
+table.loc[idx[:, 'Honda'], 'Total'] = table.sum(axis=1)
+print (table)
+ Speed Total
+Date 13/02/2019 18/02/2019
+Target Brand
+A Honda 20 30 50.0
+B Toyota 10 80 80.0
+
+You can also select the same rows on the right-hand side:
+idx = pd.IndexSlice
+table.loc[idx[:, 'Toyota'], 'Total'] = table.loc[idx[:, 'Toyota'], :].max(axis=1)
+table.loc[idx[:, 'Honda'], 'Total'] = table.loc[idx[:, 'Honda'], :].sum(axis=1)
+
",python
+"How do I break up a squared term in sympyI am using python (3.7.3) with sympy (1.6.2) to store a function with squared terms and non-squared terms, with each term being the product of exactly two variables.
+For example,
+>> import sympy as sy
+>> x = sy.Symbol('x')
+>> y = sy.Symbol('y')
+>> F = x*x+x*y
+>> print(F)
+x**2+x*y
+
+I want to be able to iterate through the terms and get each operand.
+For example,
+terms = F.expand(basic=True).args
+for term in terms:
+    (t0,t1) = term.args
+    print('t0:{}, t1:{}'.format(t0,t1))
+    # do some stuff using t0, t1
+
+This works for the x*y term, but not the x**2 term.
+>> print((x*y).args)
+(x,y)
+>> print((x**2).args) # I want this to be (x,x)
+(x,2)
+
+I tried running (x**2).expand(), but this appears to be the fully expanded version of the expression.
+My question is twofold:
+
+- is there a way to expand x**2 so that it is stored as x*x?
+- is there a better way to go about getting each operand in each term than the for loop I show above?
+
","You could define a custom function that defactors in the way you want:
+def get_factors(expr):
+    if expr.func == sy.Mul:
+        return expr.args
+    elif expr.func == sy.Pow:
+        return tuple(expr.args[0] for _ in range(expr.args[1]))
+    else:
+        raise NotImplementedError()
+
+Usage:
+>>> a, b = terms
+>>> get_factors(a)
+(x, x)
+>>> get_factors(b)
+(x, y)
+
",python
+"Python print every possibility of sentencesI wrote a simple sentence generator made out of lists.
+a_a = ['Hey, ', 'Hello, ', 'Hi, ']
+a_b = 'this is a random sentence! '
+a_c = ['Bye! ', 'Bye, bye! ', 'Goodbye! ']
+
+sentence = a_a[secrets.randbelow(3)] + a_b + a_c[secrets.randbelow(3)]
+
+The real code is much bigger. I need a way to print out every possible sentence or save it to a text file. Can someone help me? I am lost.
","Using just plain python:
+a_a = ['Hey, ', 'Hello, ', 'Hi, ']
+a_b = 'this is a random sentence! '
+a_c = ['Bye! ', 'Bye, bye! ', 'Goodbye! ']
+
+for element_a in a_a:
+    for element_c in a_c:
+        print(element_a + a_b + element_c)
+
+Using itertools to get to a single loop that iterates over all combinations:
+import itertools
+
+a_a = ['Hey, ', 'Hello, ', 'Hi, ']
+a_b = 'this is a random sentence! '
+a_c = ['Bye! ', 'Bye, bye! ', 'Goodbye! ']
+
+for element_a, element_c in itertools.product(a_a, a_c):
+    print(element_a + a_b + element_c)
+
+If you want to save the sentences to a file instead, open the file once before the loop and replace the print call with a write to that file.
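
Writing the combinations to a file could look like the sketch below. The helper name all_sentences and the output file name sentences.txt are assumptions made for this example; the temporary directory is used only so that the path is writable anywhere.

```python
import itertools
import os
import tempfile

a_a = ['Hey, ', 'Hello, ', 'Hi, ']
a_b = 'this is a random sentence! '
a_c = ['Bye! ', 'Bye, bye! ', 'Goodbye! ']

def all_sentences(starts, middle, ends):
    '''Build every start + middle + end combination, in product order.'''
    return [start + middle + end for start, end in itertools.product(starts, ends)]

sentences = all_sentences(a_a, a_b, a_c)

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.gettempdir(), 'sentences.txt')
with open(out_path, 'w', encoding='utf-8') as f:
    for sentence in sentences:
        f.write(sentence + '\n')  # one sentence per line

print(len(sentences), 'sentences written to', out_path)
```

With more than two variable parts, each extra list becomes one more argument to itertools.product.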
",python
+"RE: Transferring Python2 to Python3 on This Specific Line. I am attempting to change this line, which comes from a Python 2 source, so that Python 3 accepts it.
+Here is the error:
+
+TypeError: unicode strings are not supported, please encode to bytes:
+'$PMTK251,9600*17\r\n'
+
+Can anyone tell me why this is this way or how I can change it to suit Python3 methods?
+It is a GPS set of source in Python2 that still works but I see that all ideas relating to Python2 will be gone from availability and/or is already pretty much done and gone.
+So, my ideas were to update that line and others.
+In python3, I receive errors relating to bytes and I have currently read about the idea of (arg, newline='') in source when attempting to make .csv files in Python3.
+I am still at a loss w/ how to incorporate Python3 in this specific line.
+I can offer more about the line or the rest of the source if necessary. I received this source from toptechboy.com. I do not think that fellow ever updated the source to work w/ Python3.
+class GPS:
+    def __init__(self):
+        #This sets up variables for useful commands.
+        #This set is used to set the rate the GPS reports
+        UPDATE_10_sec = "$PMTK220,10000*2F\r\n"  #Update Every 10 Seconds
+        UPDATE_5_sec = "$PMTK220,5000*1B\r\n"  #Update Every 5 Seconds
+        UPDATE_1_sec = "$PMTK220,1000*1F\r\n"  #Update Every One Second
+        UPDATE_200_msec = "$PMTK220,200*2C\r\n"  #Update Every 200 Milliseconds
+        #This set is used to set the rate the GPS takes measurements
+        MEAS_10_sec = "$PMTK300,10000,0,0,0,0*2C\r\n"  #Measure every 10 seconds
+        MEAS_5_sec = "$PMTK300,5000,0,0,0,0*18\r\n"  #Measure every 5 seconds
+        MEAS_1_sec = "$PMTK300,1000,0,0,0,0*1C\r\n"  #Measure once a second
+        MEAS_200_msec = "$PMTK300,200,0,0,0,0*2F\r\n"  #Measure 5 times a second
+        #Set the Baud Rate of GPS
+        BAUD_57600 = "$PMTK251,57600*2C\r\n"  #Set Baud Rate at 57600
+        BAUD_9600 = "$PMTK251,9600*17\r\n"  #Set 9600 Baud Rate
+        #Commands for which NMEA Sentences are sent
+        ser.write(BAUD_57600)
+        sleep(1)
+        ser.baudrate = 57600
+        GPRMC_ONLY = "$PMTK314,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0*29\r\n"  #Send only the GPRMC Sentence
+        GPRMC_GPGGA = "$PMTK314,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0*28\r\n"  #Send GPRMC AND GPGGA Sentences
+        SEND_ALL = "$PMTK314,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0*28\r\n"  #Send All Sentences
+        SEND_NOTHING = "$PMTK314,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0*28\r\n"  #Send Nothing
+
+...
+That is the GPS Class Mr. McWhorter wrote for a GPS Module in python2. I am trying to configure this python2 source into a workable python3 class.
+I am receiving errors like "needs to be bytes" and/or "cannot use bytes here".
+Anyway, if you are handy w/ Python3 and know where I am making mistakes on this source to transfer it over to Python3, please let me know. I have tried changing the source many times to accept bytes and to be read as a utf-string.
+Here: Best way to convert string to bytes in Python 3? <<< This seems like the most popular topic on this subject but it does not answer my question so far (I think).
","In Python 3, ser.write() expects bytes rather than str, so the line works once you put a b prefix in front of the string literal, like so:
+(b'$PMTK251,9600*17\r\n')
+
+That should get rid of the error TypeError: unicode strings are not supported, please encode to bytes:
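
If the command strings have to stay as str elsewhere in the class, encoding them just before the write is an equivalent option. A minimal sketch, assuming the port object is called ser as in the question (the write itself is left commented out because no serial port is opened here):

```python
BAUD_9600 = '$PMTK251,9600*17\r\n'  # str, as in the Python 2 source

# The PMTK sentences contain only ASCII characters, so this encoding cannot fail
payload = BAUD_9600.encode('ascii')

print(type(payload).__name__)
print(payload == b'$PMTK251,9600*17\r\n')

# With an open port (assumed to be the ser object from the question):
# ser.write(payload)
```

The same applies in the other direction: data read from the port arrives as bytes and needs .decode('ascii') before it can be handled as text.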
",python
+"Button callback only works one time due to threading. I'm only able to use this button's callback once due to how I set up the 'command=' argument. I would like to be able to run this callback function again once it is finished, but I'm at a loss for how I can give the 'command=' argument a new thread object. I press it once and go through the process of the function, but once I press the button again after it is finished, I get 'RuntimeError: threads can only be started once.' Here is the code for the button and the callback:
+def ocr_callback():
+ no_file_to_save_to = False
+ try:
+ status_label.pack_forget()
+ for f in files: # files comes from another callback and is globally defined
+ name, extension = os.path.splitext(f)
+ if extension != '.pdf':
+ raise
+ if len(files) == 1:
+ new_file = filedialog.asksaveasfilename(filetypes=[('PDF', '.pdf')], defaultextension='.pdf')
+ if not new_file:
+ no_file_to_save_to = True
+ raise
+ try:
+ ocrmypdf.ocr(files[0], new_file, use_threads=True)
+ except ocrmypdf.exceptions.PriorOcrFoundError:
+ ocrmypdf.ocr(files[0], new_file, redo_ocr=True, use_threads=True)
+ elif len(files) > 1:
+ directory = filedialog.askdirectory()
+ if not directory:
+ no_file_to_save_to = True
+ raise
+ for f in files:
+ file_name = f.split('/')[-1]
+ try:
+ ocrmypdf.ocr(f, directory + '/' + file_name, use_threads=True)
+ except ocrmypdf.exceptions.PriorOcrFoundError:
+ ocrmypdf.ocr(f, directory + '/' + file_name, redo_ocr=True, use_threads=True)
+ status_label.config(text='Process Complete!', fg='blue')
+ status_label.pack(expand='yes')
+ except:
+ if no_file_to_save_to:
+ status_label.config(text='No file to save to. Process Cancelled', fg='blue')
+ else:
+ status_label.config(text='Error: One or more of the files could be corrupt.', fg='red')
+ status_label.pack(expand='yes')
+
+
+ocr_button = Button(root, text='OCR Files', relief='groove', bg='#5D1725', bd=0, width=scaled(20), fg='white',
+ command=threading.Thread(target=ocr_callback).start, state=DISABLED)
+ocr_button.pack()
+
+Any thoughts for how I could change it to make it work? I know this function must be threaded else the window will stall and freeze itself until it is finished. The 'ocr' function is what causes the bogging and necessity for the threading.
","A Thread object can only be started once, and command=threading.Thread(target=ocr_callback).start binds the button to the start method of one single Thread, created when the button is built. Start the thread from a launch function instead, so that every click creates a fresh Thread.
+Maybe like this:
+def launch_cmd(dummy=None):
+    threading.Thread(target=ocr_callback).start()
+
+...
+ocr_button = Button(root, text='OCR Files', relief='groove', bg='#5D1725',\
+ bd=0, width=scaled(20), fg='white', command=launch_cmd, state=DISABLED)
+
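
The difference can be reproduced without any GUI. In this sketch slow_task is a made-up stand-in for ocr_callback, and calling launch_cmd twice plays the role of two button clicks:

```python
import threading

finished = []

def slow_task():
    finished.append('done')  # stand-in for the long-running OCR work

def launch_cmd(event=None):
    '''Create and start a new Thread on every call.'''
    worker = threading.Thread(target=slow_task, daemon=True)
    worker.start()
    return worker

# Two clicks: each call gets its own Thread, so both succeed
launch_cmd().join()
launch_cmd().join()

# One Thread object started a second time, as with the original command=
single = threading.Thread(target=slow_task)
single.start()
single.join()
try:
    single.start()
    error = None
except RuntimeError as exc:
    error = str(exc)

print(len(finished), error)
```

Note that Tkinter widgets are generally not safe to update from a worker thread, so the status_label calls inside the callback may be worth moving back to the main thread, for example through root.after.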
",python
+"Match text from beautifulsoup children in Python bs4. In the code below I have Beautiful Soup data "table":
+This is table.prettify
+<bound method Tag.prettify of <table border="0" cellpadding="14" cellspacing="0"
+
+
+rows = table.findChildren(['tr'])
+
+These are elements of rows:
+<tr><td style="width: 166.35pt; padding: 10.5pt;" width="222"><p class="MsoNormal" style="text-justify: inter-ideograph; text-align: justify;"
+
+
+I want to get something like:
+a1 = 'a1': l1'
+a2 = 'a2: r2'
+
+Can someone help me in how to cleanly go from "table" to get "a1" and "a2"? Any suggestions are welcome I can probably remove findchildren also.
","
+- Search for a string using the text parameter.
+- Search for the next tag using find_next(), which returns the first match.
+
+from bs4 import BeautifulSoup
+
+soup = BeautifulSoup(your_html, 'html.parser')
+
+a1 = soup.find(text=lambda t: "3" in t)
+a2 = soup.find(text=lambda t: "4" in t)
+
+print(a1, a1.find_next(text=True))
+print(a2, a2.find_next(text=True))
+
+Output:
+3. Maturity Date: January 12, 2021
+4. Offering Amount: About 7,550 billion yen
+
",python
+"Error while running CNN for 1 dimensional data in R. I am trying to run a one-dimensional CNN in R using the keras package. I am trying to create a one-dimensional Convolutional Neural Network (CNN) architecture with the following specification
+![]()
+library(keras)
+library(deepviz)
+
+#create a neural network with a convolutional layer and train the model
+model <- keras_model_sequential() %>%
+ layer_conv_1d(filters=32, kernel_size=4, activation="relu", input_shape=c(100, 10)) %>%
+ layer_max_pooling_1d(pool_size=2) %>%
+ layer_conv_1d(filters=64, kernel_size=4, activation="relu") %>%
+ layer_max_pooling_1d(pool_size=5) %>%
+ layer_conv_1d(filters=128, kernel_size=4, activation="relu") %>%
+ layer_max_pooling_1d(pool_size=5) %>%
+ layer_conv_1d(filters=256, kernel_size=4, activation="relu") %>%
+ layer_max_pooling_1d(pool_size=5) %>%
+ layer_dropout(rate=0.4) %>%
+ layer_flatten() %>%
+ layer_dense(units=100, activation="relu") %>%
+ layer_dropout(rate=0.2) %>%
+ layer_dense(units=1, activation="linear")
+
+But it is giving me following error
+
+Error in py_call_impl(callable, dots$args, dots$keywords) :
+ValueError: Negative dimension size caused by subtracting 4 from 1 for 'conv1d_20/conv1d' (op: 'Conv2D') with input shapes: [?,1,1,128], [1,4,128,256].
+
+How to solve the error?
+Another question, how to optimise the filters, kernel_size, pool_size, rate, units?
+In my question input_shape=c(100, 10) is an arbitrary value. How to decide about the input size?
","You have too many max-pooling layers. Each max-pooling layer divides the length of its input by pool_size, and each convolution with kernel_size=4 shortens it by a further 3, so the sequence of length 100 has shrunk to length 1 by the time it reaches the last convolution, which cannot fit a kernel of size 4.
+Try reducing the pool_size parameters or, alternatively, remove the last 2 max-pooling layers. A value you can try is pool_size=2 for all layers.
+As for the parameters, you should learn what each of them means.
+Here you can find an explanation of the convolution layer and max-pooling layer parameters such as filters, kernel size and pool size:
+Convolutional layer
+The dropout layer is a regularizer: at every training step it zeroes a randomly chosen fraction (given by the rate parameter) of the layer's inputs. A larger rate gives less overfitting, but training usually takes longer to converge. Learn about it here:
+Dropout layer
+The units parameter is the number of neurons in the fully connected layer.
+Fully Connected layer
+The input shape is the dimensions of a single sample; the number of records is not included. For 1D data it is (N, C), where N is the sequence length and C is the number of channels; with 1 channel it is (N, 1).
+For 2D data it is (Height, Width, Channels).
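
The shrinking length can be traced by hand. The sketch below is written in Python rather than R and assumes the Keras defaults, that is valid padding and stride 1 for the convolutions and a pooling stride equal to pool_size; the helper names are made up for this example.

```python
def conv1d_length(length, kernel_size):
    '''Output length of a 1D convolution with valid padding and stride 1.'''
    return length - kernel_size + 1

def max_pool1d_length(length, pool_size):
    '''Output length of 1D max pooling whose stride equals pool_size.'''
    return length // pool_size

def trace_lengths(input_length, layers):
    '''Return the sequence length after each layer, stopping at the first layer that no longer fits.'''
    lengths = [input_length]
    for kind, size in layers:
        current = lengths[-1]
        if current < size:
            print(f'{kind} with size {size} does not fit a sequence of length {current}')
            break
        if kind == 'conv':
            lengths.append(conv1d_length(current, size))
        else:
            lengths.append(max_pool1d_length(current, size))
    return lengths

# The conv/pool stack from the question: (layer kind, kernel_size or pool_size)
original = [('conv', 4), ('pool', 2), ('conv', 4), ('pool', 5),
            ('conv', 4), ('pool', 5), ('conv', 4), ('pool', 5)]
# Same stack with pool_size=2 everywhere
all_pool_2 = [(kind, 4 if kind == 'conv' else 2) for kind, _ in original]

print(trace_lengths(100, original))    # [100, 97, 48, 45, 9, 6, 1], then the 4th conv fails
print(trace_lengths(100, all_pool_2))  # [100, 97, 48, 45, 22, 19, 9, 6, 3]
```

The first trace stops at length 1 in front of the fourth convolution, which matches the message about subtracting 4 from 1.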
",python
+"How to limit lower error of bar plot to 0? I calculated the rttMeans and rttStds arrays. However, the value of rttStds makes the lower error less than 0.
+rttStds = [3.330311915835426, 3.3189677330174883, 3.3319538853150386, 3.325173772304221, 3.3374145232695813]
+
+How to set lower error to 0 instead of -#?
+The python bar plot code is below.
+![]()
+import numpy as np
+import matplotlib.pyplot as plt
+import seaborn as sns
+
+sns.set(rc={'figure.figsize':(18,16)},style='ticks',font_scale = 1.5,font='serif')
+
+N = 5
+ind = ['RSU1', 'RSU2', 'RSU3', 'RSU4', 'RSU5'] # the x locations for the groups
+width = 0.4 # the width of the bars: can also be len(x) sequence
+
+fig = plt.figure(figsize=(10,6))
+ax = fig.add_subplot(111)
+
+p1 = plt.bar(ind, rttMeans, width, yerr=rttStds, log=False, capsize = 16, color='green', hatch="/", error_kw=dict(elinewidth=3,ecolor='black'))
+plt.margins(0.01, 0)
+
+#Optional code - Make plot look nicer
+plt.xticks(rotation=0)
+i=0.18
+for row in rttMeans:
+    plt.text(i, row, "{0:.1f}".format(row), color='black', ha="center")
+    i = i + 1
+
+ax.spines['right'].set_visible(False)
+ax.spines['top'].set_visible(False)
+params = {'axes.titlesize':24,
+ 'axes.labelsize':24,
+ 'xtick.labelsize':28,
+ 'ytick.labelsize':28,
+ 'legend.fontsize': 24,
+ 'axes.spines.right':False,
+ 'axes.spines.top':False}
+plt.rcParams.update(params)
+
+plt.tick_params(axis="y", labelsize=28, labelrotation=20, labelcolor="black")
+plt.tick_params(axis="x", labelsize=28, labelrotation=20, labelcolor="black")
+
+plt.ylabel('RT Time (millisecond)', fontsize=24)
+plt.title('# Participating RSUs', fontsize=24)
+
+
+# plt.savefig('RSUs.pdf', bbox_inches='tight')
+plt.show()
+
","You can pass yerr as a pair [lower_errors, upper_errors] where you can control lower_errors :
+lowers = np.minimum(rttStds,rttMeans)
+p1 = plt.bar(ind, rttMeans, width, yerr=[lowers,rttStds], log=False, capsize = 16, color='green', hatch="/", error_kw=dict(elinewidth=3,ecolor='black'))
+
+Output:
+![]()
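
To see what the clipping does, here is the same calculation in plain Python. The means are hypothetical values chosen for illustration, because rttMeans is not given in the question:

```python
rtt_stds = [3.330311915835426, 3.3189677330174883, 3.3319538853150386,
            3.325173772304221, 3.3374145232695813]
rtt_means = [2.1, 4.8, 3.0, 1.5, 6.2]  # hypothetical values, not from the question

# Element-wise minimum, which is what np.minimum(rttStds, rttMeans) computes
lowers = [min(std, mean) for std, mean in zip(rtt_stds, rtt_means)]

# Bottom end of each error bar: never below zero after clipping
bar_bottoms = [mean - low for mean, low in zip(rtt_means, lowers)]

print(lowers)
print(bar_bottoms)
```

Bars whose mean is smaller than the standard deviation get a lower error equal to the mean, so their whisker ends exactly at 0; the other bars keep the full standard deviation.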
",python
+"Any easy way to transform a missing number sequence to its range? Suppose I have a list that goes like:
+'''
+[1,2,3,4,9,10,11,20]
+'''
+I need the result to be like :
+'''
+[[4,9],[11,20]]
+'''
+I have defined a function that goes like this :
+def get_range(lst):
+    i=0
+    seqrange=[]
+    for new in lst:
+        a=[]
+        start=new
+        end=new
+        if i==0:
+            i=1
+            old=new
+        else:
+            if new - old >1:
+                a.append(old)
+                a.append(new)
+            old=new
+        if len(a):
+            seqrange.append(a)
+    return seqrange
+
+Is there any other easier and efficient way to do it? I need to do this in the range of millions.
","You can use numpy arrays and the diff function that comes along with them. Numpy is so much more efficient than looping when you have millions of rows.
+
+Slight aside:
+Why are numpy arrays so fast? Because they are arrays of data instead of arrays of pointers to data (which is what Python lists are), because they offload a whole bunch of computations to a backend written in C, and because they leverage the SIMD paradigm to run a Single Instruction on Multiple Data simultaneously.
+
+Now back to the problem at hand:
+The diff function gives us the difference between consecutive elements of the array. Pretty convenient, given that we need to find where this difference is greater than a known threshold!
+import numpy as np
+
+threshold = 1
+arr = np.array([1,2,3,4,9,10,11,20])
+
+deltas = np.diff(arr)
+# There's a gap wherever the delta is greater than our threshold
+gaps = deltas > threshold
+gap_indices = np.argwhere(gaps)
+
+gap_starts = arr[gap_indices]
+gap_ends = arr[gap_indices + 1]
+
+# Finally, stack the two arrays horizontally
+all_gaps = np.hstack((gap_starts, gap_ends))
+print(all_gaps)
+# Output:
+# [[ 4 9]
+# [11 20]]
+
+You can access all_gaps like a 2D matrix: all_gaps[0, 1] would give you 9, for example. If you really need the answer as a list-of-lists, simply convert it like so:
+all_gaps_list = all_gaps.tolist()
+print(all_gaps_list)
+# Output: [[4, 9], [11, 20]]
+
+
+Comparing the runtime of the iterative method from @happydave's answer with the numpy method:
+import random
+import timeit
+
+import numpy as np
+
+def gaps1(arr, threshold):
+    deltas = np.diff(arr)
+    gaps = deltas > threshold
+    gap_indices = np.argwhere(gaps)
+    gap_starts = arr[gap_indices]
+    gap_ends = arr[gap_indices + 1]
+    all_gaps = np.hstack((gap_starts, gap_ends))
+    return all_gaps
+
+def gaps2(lst, thr):
+    seqrange = []
+    for i in range(len(lst)-1):
+        if lst[i+1] - lst[i] > thr:
+            seqrange.append([lst[i], lst[i+1]])
+    return seqrange
+
+test_list = [i for i in range(100000)]
+for i in range(100):
+    # Pop by index so that a value that was already removed is never requested again
+    test_list.pop(random.randrange(len(test_list)))
+
+test_arr = np.array(test_list)
+
+# Make sure both give the same answer:
+assert np.all(gaps1(test_arr, 1) == gaps2(test_list, 1))
+
+t1 = timeit.timeit('gaps1(test_arr, 1)', setup='from __main__ import gaps1, test_arr', number=100)
+t2 = timeit.timeit('gaps2(test_list, 1)', setup='from __main__ import gaps2, test_list', number=100)
+
+print(f"t1 = {t1}s; t2 = {t2}s; Numpy gives ~{t2 // t1}x speedup")
+
+On my laptop, this gives:
+t1 = 0.020834800001466647s; t2 = 1.2446780000027502s; Numpy gives ~59.0x speedup
+
+My word that's fast!
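
If NumPy is not available, the loop can still be written more compactly in plain Python by pairing each value with its successor. This is only a sketch; the function name find_gaps is made up here, and it will not match the NumPy timing above.

```python
from itertools import islice

def find_gaps(values, threshold=1):
    '''Return [previous, next] pairs wherever consecutive values differ by more than threshold.'''
    # zip pairs each value with its successor; islice avoids copying the list
    return [[a, b] for a, b in zip(values, islice(values, 1, None)) if b - a > threshold]

print(find_gaps([1, 2, 3, 4, 9, 10, 11, 20]))  # [[4, 9], [11, 20]]
```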
",python
+"Adding legend information to matplotlib plot. I have created a confidence interval plot which is working exactly how I want:
+month = ['Nov-20', 'Dec-20', 'Jan-21', 'Feb-21', 'Mar-21', 'Apr-21', 'May-21', 'Jun-21', 'Jul-21', 'Aug-21', 'Sep-21', 'Oct-21']
+x = [0.85704744, 0.74785299, 0.68103776, 0.69793547, 0.8396294 ,
+ 0.25560889, 0.37400785, 0.00742866, 0.84700224, 0.95142221,
+ 0.08544432, 0.09068883]
+y = [0.09448781, 0.69683102, 0.96261431, 0.93635227, 0.31503366,
+ 0.38335671, 0.24244469, 0.36712811, 0.22270387, 0.01506295,
+ 0.78433 , 0.38408096]
+z = [0.84585527, 0.59615266, 0.60263581, 0.26366399, 0.42948978,
+ 0.18138516, 0.54841131, 0.65201558, 0.03089001, 0.20581638,
+ 0.57586628, 0.33622286]
+
+fig, ax = plt.subplots(figsize=(17,8))
+ax.plot(month, z)
+ax.fill_between(month, x, y, color='b', alpha=.3)
+ax.hlines(y=0.50, xmin=0, xmax=(len(month)), colors='orange', linestyles='--', lw=2, label="Target: 50%")
+plt.xlabel('Month')
+plt.ylabel('Target %')
+plt.rcParams["font.size"] = "20"
+plt.ylim((0.1, 1.0))
+plt.legend(bbox_to_anchor=(1.04,0.5), loc="center left", borderaxespad=0)
+plt.title("Target Forecast Nov20 - Nov21")
+
+plt.show()
+plt.close()
+
+![]()
+However, I want to add the following to the legend:
+
+- An indicator that the blue line is the "probable forecast"
+- An indicator that the blue
fill_between is the confidence interval
+
+I did read this matplotlib documentation, and so I tried:
+fig, ax = plt.subplots(figsize=(17,8))
+prob, = ax.plot(month, z)
+btwn, = ax.fill_between(month, x, y, color='b', alpha=.3)
+tgt, = ax.hlines(y=0.50, xmin=0, xmax=(len(month)), colors='orange', linestyles='--', lw=2, label="Target: 50%")
+plt.xlabel('Month')
+plt.ylabel('Target %')
+plt.rcParams["font.size"] = "20"
+plt.ylim((0.1, 1.0))
+plt.legend([prob, btwn, tgt], ['Probable', 'Confidence Interval', 'Target'])
+plt.title("Target Forecast Nov20 - Nov21")
+
+plt.show()
+plt.close()
+
+But it ends in a TypeError:
+---------------------------------------------------------------------------
+TypeError Traceback (most recent call last)
+<ipython-input-61-3ef952c0fc7f> in <module>
+ 1 fig, ax = plt.subplots(figsize=(17,8))
+ 2 prob, = ax.plot(month, z)
+----> 3 btwn, = ax.fill_between(month, x, y, color='b', alpha=.3)
+ 4 tgt, = ax.hlines(y=0.50, xmin=0, xmax=(len(month)), colors='orange', linestyles='--', lw=2, label="Target: 50%")
+ 5 plt.xlabel('Month')
+
+TypeError: 'PolyCollection' object is not iterable
+
+How can I add these things to the legend?
","The matplotlib documentation often suggests to use proxy artists.
+Otherwise, in your case, you can just add the label argument and name it the way you want, and the legend should be updated automatically.
+In your case:
+ax.plot(month, z, label="Probable Forecast")
+ax.fill_between(month, x, y, color='b', alpha=.3, label="Confidence Interval")
+
+should work.
",python
+"Average value and sum of strings for each day. I have a dataframe with 3 columns. I am using python / pandas.
+       date      id  my_value1 my_value2
+0  31.07.20  128909   0.098333  positive
+1  31.07.20  128914   0.136364  positive
+3  31.07.20  853124  -0.025000  negative
+4  30.07.20  123456  -1.000000   neutral
+...
+
+The first column contains the date (can be parsed to any other form) with days from 06.02.20 to 31.07.20 with some days missing. As you can see each day appears several times.
+The column my_value1 contains a float number between 1 and -1.
+The column my_value2 contains either the string "positive", "negative" or "neutral".
+What I want is a new dataframe containing the average value of "my_value1" for each day and the sum of each value of "my_value2" which could look like this:
+       date  average_value1  sum_positive  sum_negative  sum_neutral
+0  31.07.20             0.1          1532          2153         5321
+1  30.07.20             0.2          2153          5321         1532
+3  29.07.20            -0.3          1234          1234         1234
+...
+
+Appreciate any help!
","If the original DataFrame's index is unimportant, you could do
+encoded_df = pd.get_dummies(df, prefix="", prefix_sep="", columns=["my_value2"])
+output = encoded_df.groupby("date").agg(
+    average_value1 = pd.NamedAgg("my_value1", "mean"),
+    sum_positive = pd.NamedAgg("positive", "sum"),
+    sum_negative = pd.NamedAgg("negative", "sum"),
+    sum_neutral = pd.NamedAgg("neutral", "sum")
+).reset_index()
+
+Output:
+       date  average_value1  sum_positive  sum_negative  sum_neutral
+0  30.07.20       -1.000000             0             0            1
+1  31.07.20        0.069899             2             1            0
+
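
An alternative sketch that avoids the dummy columns: pd.crosstab counts the labels per day directly, and the daily mean is computed separately and joined on the date. The sample frame is rebuilt here from the four rows shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['31.07.20', '31.07.20', '31.07.20', '30.07.20'],
    'id': [128909, 128914, 853124, 123456],
    'my_value1': [0.098333, 0.136364, -0.025000, -1.000000],
    'my_value2': ['positive', 'positive', 'negative', 'neutral'],
})

# Mean of my_value1 per day
means = df.groupby('date')['my_value1'].mean().rename('average_value1')

# Number of rows per day and label, with columns renamed to sum_<label>
counts = pd.crosstab(df['date'], df['my_value2']).add_prefix('sum_')

result = pd.concat([means, counts], axis=1).rename_axis('date').reset_index()
print(result)
```

The dates are still strings at this point, so the rows sort alphabetically; parsing them with pd.to_datetime(df['date'], format='%d.%m.%y') before grouping would give chronological order.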
",python
+"DJANGO EMAIL CONFIRMATION: [WinError 10061] No connection could be made because the target machine actively refused it. I'm trying to send an email verification code for account confirmation.
+Now, the thing is that I am trying to send an email to myself first, and my other email. I have turned off my antivirus, so that shouldn't be a problem. Other than this I can't figure out what I did wrong that it is not sending emails on the Gmail account. Please point out what I'm doing wrong. I even applied all the fixes mentioned by this thread.
+views.py
+from django.core.mail import send_mail
+from django.shortcuts import render,redirect
+from django.contrib import messages,auth
+from django.contrib.auth.models import User # this table already exists in django we import it
+from django.contrib.auth import authenticate, login
+from django.conf import settings
+from django.core.mail import EmailMessage
+
+def register(request):
+ if request.method=='POST':
+ fname = request.POST['fname']
+ lname=request.POST['lname'] #request.post[id of the input field]
+ email = request.POST['email']
+ password = request.POST['pass']
+ password2 = request.POST['confirmpass']
+ agree=request.POST.get('agree')
+ if fname == '':
+ messages.warning(request, 'Please enter First name!')
+ return redirect('register')
+
+ if lname == '':
+ messages.warning(request, 'Please enter Last name!')
+ return redirect('register')
+
+ if email == '':
+ messages.warning(request, 'Please enter Email!')
+ return redirect('register')
+
+ if password == '':
+ messages.warning(request, 'Please enter Password!')
+ return redirect('register')
+
+ if password2 == '':
+ messages.warning(request, 'Please enter Confirm Password!')
+ return redirect('register')
+ if ('agree' not in request.POST):
+ messages.warning(request, 'Please agree to our terms and conditions!')
+ return redirect('register')
+
+ if password==password2:
+
+ if User.objects.filter(username=email):
+ messages.error(request,"Email Already Exists")
+ return redirect('register')
+ else:
+ user=User.objects.create_user(username=email,first_name=fname,last_name=lname,password=password) #these are postgres first_name,last_name
+ user.save()
+ messages.success(request,"Successfully Registered")
+ subject = 'Site Contact Form'
+ contact_message = "%s : %s via %s "%(fname, lname, email)
+ from_email = settings.EMAIL_HOST_USER
+ to_email = [from_email, 'myotheremail@gmail.com']
+ send_mail(subject, contact_message, from_email, to_email, fail_silently=False)
+
+ return redirect('login')
+
+
+
+ else:
+ messages.error(request,"Passwords don't match")
+ return redirect('register')
+ else:
+ return render(request, 'signupdiv.html')
+
+settings.py
+EMAIL_HOSTS = 'smtp.gmail.com'
+EMAIL_HOST_USER = 'myemail@gmail.com'
+EMAIL_HOST_PASSWORD = '*****'
+EMAIL_PORT = 587
+EMAIL_USE_TLS = True
+EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
+DEFAULT_FROM_EMAIL = EMAIL_HOST_USER
+
+urls.py
+from django.urls import path
+from . import views
+
+urlpatterns = [
+ path('login', views.login, name='login'),
+ path('register', views.register, name='register'),
+ path('logout', views.logout, name='logout'),
+]
+
+When I register, the data is successfully stored in the database (as expected), but the send_mail(...) call in views.py raises an error. Following is the error:
+ConnectionRefusedError at /account/register
+[WinError 10061] No connection could be made because the target machine actively refused it
+Request Method: POST
+Request URL: http://127.0.0.1:8000/account/register
+Django Version: 3.1.2
+Exception Type: ConnectionRefusedError
+Exception Value:
+[WinError 10061] No connection could be made because the target machine actively refused it
+Exception Location: C:\Users\User\AppData\Local\Programs\Python\Python37\lib\socket.py, line 716, in create_connection
+Python Executable: C:\Users\User\Desktop\pfa\venv\Scripts\python.exe
+Python Version: 3.7.5
+Python Path:
+['C:\\Users\\User\\Desktop\\pfa',
+ 'C:\\Users\\User\\AppData\\Local\\Programs\\Python\\Python37\\python37.zip',
+ 'C:\\Users\\User\\AppData\\Local\\Programs\\Python\\Python37\\DLLs',
+ 'C:\\Users\\User\\AppData\\Local\\Programs\\Python\\Python37\\lib',
+ 'C:\\Users\\User\\AppData\\Local\\Programs\\Python\\Python37',
+ 'C:\\Users\\User\\Desktop\\pfa\\venv',
+ 'C:\\Users\\User\\Desktop\\pfa\\venv\\lib\\site-packages']
+Server time: Sun, 25 Oct 2020 23:17:08 +0000
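+
","You missed one small thing: the setting name is misspelled, so Django never sees your SMTP host and falls back to its default EMAIL_HOST of 'localhost', where nothing is listening on port 587.
+In settings.py
+it should be EMAIL_HOST = 'smtp.gmail.com'
+not EMAIL_HOSTS = 'smtp.gmail.com'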
",python
+"discord.py (rewrite) How do I make a command to a specific channel? I need to make a command that is narrowed to only a specific channel. It sends a random picture of a car, but I have a channel called 'Pics' that I specifically need to narrow the command to. Can you help me out?
+@client.command()
+async def car(ctx):
+    pictures = [
+        'https://car-images.bauersecure.com/pagefiles/78294/la_auto_show_11.jpg',
+        'http://www.azstreetcustom.com/uploads/2/7/8/9/2789892/az-street-custom-gt40-2_orig.jpg',
+        'http://tenwheel.com/imgs/a/b/l/t/z/1967_firebird_1968_69_70_2000_camaro_blended_custom_supercharged_street_car_1_lgw.jpg',
+        'https://rthirtytwotaka.files.wordpress.com/2013/06/dsc_0019.jpg',
+        'http://speedhunters-wp-production.s3.amazonaws.com/wp-content/uploads/2008/06/fluke27.jpg',
+        'https://i.ytimg.com/vi/pCt0KXC1tng/maxresdefault.jpg',
+        'https://i2.wp.com/www.tunedinternational.com/featurecars/dorift/02.jpg'
+    ]
+    channel = discord.utils.get()
+    if channel == 705161333972140072:
+        await ctx.channel.purge(limit=1)
+        await ctx.send(f'{random.choice(pictures)}')
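+
","discord.utils.get() needs an iterable to search, so calling it with no arguments fails, and a channel object would not compare equal to a bare integer ID in any case. Look the channel up by its ID and compare it with the channel the command was invoked in. Replace with
+channel = client.get_channel(705161333972140072)
+if ctx.channel == channel:
+    await ctx.channel.purge(limit=1)
+    await ctx.send(f'{random.choice(pictures)}')
+
+Remove from your code
+channel = discord.utils.get()
+if channel == 705161333972140072:
+    await ctx.channel.purge(limit=1)
+    await ctx.send(f'{random.choice(pictures)}')
+
+This was a logic error rather than a syntax error: the original code compared the result of a channel lookup with an integer ID.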
",python
+"How to convert positive numbers to negative in Python? I know that abs() can be used to convert numbers to positive, but is there something that does the opposite?
+I have an array full of numbers which I need to convert to negative:
+array1 = []
+arrayLength = 25
+for i in range(arrayLength):
+    array1.append(random.randint(0, arrayLength))
+
+I thought perhaps I could convert the numbers as they're being added, not after the array is finished. Anyone knows the code for that?
+Many thanks in advance
","If you want to force a number to negative, regardless of whether it's initially positive or negative, you can use:
+ -abs(n)
+
+Note that integer 0 will remain 0.
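
Applied to the loop in the question, the sign can be flipped while the numbers are generated. A short sketch, where the names array_length, mixed and forced are chosen for this example:

```python
import random

array_length = 25

# randint(0, n) is never negative, so a unary minus is enough while filling the list
array1 = [-random.randint(0, array_length) for _ in range(array_length)]

# For values whose sign is not known in advance, -abs(n) forces the result to be <= 0
mixed = [3, -7, 0, 12]
forced = [-abs(n) for n in mixed]

print(array1)
print(forced)  # [-3, -7, 0, -12]
```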
",python
+"Error while obtaining start requests with Scrapy. I am having some trouble trying to scrape through these 2 specific pages and don't really see where the problem is. If you have any ideas or advice I am all ears!
+Thanks in advance !
+import scrapy
+
+
+class SneakersSpider(scrapy.Spider):
+    name = "sneakers"
+
+    def start_requests(self):
+        headers = {'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}
+        urls = [
+            #"https://stockx.com/fr-fr/retro-jordans",
+            "https://stockx.com/fr-fr/retro-jordans?page=2",
+            "https://stockx.com/fr-fr/retro-jordans?page=3",
+        ]
+        for url in urls:
+            yield scrapy.Request(url = url, callback =self.parse ,headers = headers)
+
+    def parse(self,response):
+        page = response.url.split("=")[-1]
+        filename = f'sneakers-{page}.html'
+        with open(filename, 'wb') as f:
+            f.write(response.body)
+        self.log(f'Saved file {filename}')
+
+
+
+
+
+
","Looking at the traceback always helps. You should see something like this in your spider's output:
+Traceback (most recent call last):
+ File "c:\program files\python37\lib\site-packages\scrapy\core\engine.py", line 127, in _next_request
+ request = next(slot.start_requests)
+ File "D:\Users\Ivan\Documents\Python\a.py", line 15, in start_requests
+ yield scrapy.Request(url = url, callback =self.parse ,headers = headers)
+ File "c:\program files\python37\lib\site-packages\scrapy\http\request\__init__.py", line 39, in __init__
+ self.headers = Headers(headers or {}, encoding=encoding)
+ File "c:\program files\python37\lib\site-packages\scrapy\http\headers.py", line 12, in __init__
+ super(Headers, self).__init__(seq)
+ File "c:\program files\python37\lib\site-packages\scrapy\utils\datatypes.py", line 193, in __init__
+ self.update(seq)
+ File "c:\program files\python37\lib\site-packages\scrapy\utils\datatypes.py", line 229, in update
+ super(CaselessDict, self).update(iseq)
+ File "c:\program files\python37\lib\site-packages\scrapy\utils\datatypes.py", line 228, in <genexpr>
+ iseq = ((self.normkey(k), self.normvalue(v)) for k, v in seq)
+ValueError: too many values to unpack (expected 2)
+
+As you can see, there is a problem in the code that handles request headers.
+headers is a set in your code; it should be a dict instead.
+This works without a problem:
+headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'}
+
+Another way to set a default user agent for all requests is using the USER_AGENT setting.
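
The ValueError can be reproduced without Scrapy. The last frame of the traceback unpacks every item of the headers into a key and a value; the sketch below imitates that step with a shortened, made-up user agent string:

```python
user_agent = 'Mozilla/5.0 (example)'  # shortened placeholder, not the real string

headers_set = {user_agent}                   # braces without a key: this is a set
headers_dict = {'User-Agent': user_agent}    # key: value pairs: this is a dict

try:
    # Each item of the set is one long string, which cannot be split into exactly (k, v)
    pairs = [(k, v) for k, v in headers_set]
    message = None
except ValueError as exc:
    message = str(exc)

# A dict yields proper (key, value) pairs through .items()
pairs = [(k, v) for k, v in headers_dict.items()]

print(type(headers_set).__name__, type(headers_dict).__name__)
print(message)
print(pairs)
```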
",python
+"Scrape highchart into python. Can anyone tell me how I can extract the highchart data from the following link into python?
+https://www.ree.es/en/datos/generation/generation-structure
","Try the approach below, which uses Python with requests: it is simple, reliable and fast, and needs little code. I found the API URL by inspecting the Network tab of the Google Chrome developer tools while the page loaded.
+What the script below does:
+
+- First it builds the API URL from the dynamic parameters (the names in all caps) and sends a GET request. You can pass any valid values in the parameters, and the URL is rebuilt each time you want to fetch something from the chart.
+- After getting the response, it parses the JSON data with json.loads.
+- Finally it iterates over the list of attributes and values of the chart, e.g. title, type, color, last update, percentage. You can adapt these attributes to your needs.
+
+import json
+import requests
+from urllib3.exceptions import InsecureRequestWarning
+requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
+
+def scrape_chart_data():
+    #### Dynamic parameters ####
+    START_DATE = '2020-10-22T00:00'
+    END_DATE = '2020-10-29T23:59'
+    TIME_TRUNC = 'day'
+    CACHED = 'true'
+    SYSTEM_ELECTRIC = 'nacional'
+
+    URL = 'https://apidatos.ree.es/en/datos/generacion/estructura-generacion?start_date=' + START_DATE + '&end_date=' + END_DATE + '&time_trunc=' + TIME_TRUNC + \
+        '&cached=' + CACHED + '&systemElectric=' + SYSTEM_ELECTRIC # Dynamic URL created using params
+
+    response = requests.get(URL,verify = False) # GET API request
+    result = json.loads(response.text) # Parse JSON data
+    extracted_chart_data = result['included'] # extracted data using GET API call
+
+    for idx in range(len(extracted_chart_data)): # iterate over the data and print attributes and values
+        print('-' * 100)
+        attributes = extracted_chart_data[idx]['attributes'] #attributes
+        values = extracted_chart_data[idx]['attributes']['values'] #values
+        print('Type : ', attributes['type'])
+        print('Title : ', attributes['title'])
+        print('Color : ', attributes['color'])
+        print('Last Update : ', attributes['last-update'])
+        print('Magnitude : ', attributes['magnitude'])
+        print('-' * 50 + ' Values of ' + attributes['title'] + ' ' + '-' * 50)
+        for val in range(len(values)):
+            print('Date and Time : ', values[val]['datetime'])
+            print('Percentage : ', values[val]['percentage'])
+            print('Value : ', values[val]['value'])
+        print('-' * 100)
+
+scrape_chart_data()
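
The query string can also be built from a dict instead of by string concatenation, which takes care of escaping. A sketch using only the standard library; build_url is a made-up helper name, and the parameter names are the ones from the URL above:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://apidatos.ree.es/en/datos/generacion/estructura-generacion'

def build_url(start_date, end_date, time_trunc='day', cached='true', system_electric='nacional'):
    '''Assemble the API URL from its query parameters.'''
    params = {
        'start_date': start_date,
        'end_date': end_date,
        'time_trunc': time_trunc,
        'cached': cached,
        'systemElectric': system_electric,
    }
    # urlencode percent-escapes the values, so the colon in the time becomes %3A
    return BASE_URL + '?' + urlencode(params)

url = build_url('2020-10-22T00:00', '2020-10-29T23:59')
print(url)

# Round trip: decoding the query string gives the original values back
print(parse_qs(urlsplit(url).query)['start_date'])
```

requests can do the same step itself when the dict is passed as requests.get(BASE_URL, params=params).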
+
+
+
",python
+"Execute python program with multiple files - Python - Bash. I have hundreds of XML files and I would like to parse them into CSV files. I have already written this program.
+To execute the python program I use this command (on VScode MS):
+python ConvertXMLtoCSV.py -i Alarm120.xml -o Alarm120.csv
+
+My question is, how change this script to integrate a sort of for loop to execute this program for each xml files ?
+
+UPDATE
+If my files and folders are organized like in the picture:
+![]()
+I tried the following and ran it as a .bat file on Windows 10, but it does nothing:
+#!/bin/bash
+for xml_file in XML_Files/*.xml
+do
+ csv_file=${xml_file/.xml/.csv}
+ python ConvertXMLtoCSV.py -i XML_Files/$xml_file -o CSV_Files/$csv_file
+done
+
","After discussion, it appears that you wish to use an external script so as to leave the original ConvertXMLtoCSV.py script unmodified (as required by other projects), but that although you tagged bash in the question, it turned out that you were not in fact able to use bash to invoke python when you tried it in your setup.
+This being the case, it is possible to adapt Rolv Apneseth's answer so that you do the looping in Python, but inside a separate script (let's suppose that this is called convert_all.py), which then runs the unmodified ConvertXMLtoCSV.py as an external process. This way, the ConvertXMLtoCSV.py will still be set up to process only one file each time it is run.
+To call an external process, you could either use os.system or subprocess.Popen, so here are two options.
+Using os.system:
+import os
+import sys
+
+directory_path = sys.argv[1]
+
+for file in os.listdir(directory_path):
+    if file.endswith(".xml"):
+        # join with the directory so this also works when run from another folder
+        xml_path = os.path.join(directory_path, file)
+        csv_path = xml_path[:-len(".xml")] + ".csv"
+        os.system(f'python ConvertXMLtoCSV.py -i "{xml_path}" -o "{csv_path}"')
+
+Note: for versions of Python too old to support f-strings, that last line could be changed to
+        os.system('python ConvertXMLtoCSV.py -i "{}" -o "{}"'.format(xml_path, csv_path))
+
+Using subprocess.Popen:
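+import os
+import subprocess
+import sys
+
+directory_path = sys.argv[1]
+
+for file in os.listdir(directory_path):
+    if file.endswith(".xml"):
+        # join with the directory so this also works when run from another folder
+        xml_path = os.path.join(directory_path, file)
+        csv_path = xml_path[:-len(".xml")] + ".csv"
+        p = subprocess.Popen(['python', 'ConvertXMLtoCSV.py',
+                              '-i', xml_path,
+                              '-o', csv_path])
+        p.wait()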
+
+You could then run it using some command such as:
+python convert_all.py C:/Users/myuser/Desktop/myfolder
+
+or whatever the folder is where you have the XML files.
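+
+For the XML_Files / CSV_Files layout from your update, a variant along the same lines could look like the sketch below. It assumes Python 3.5+ for subprocess.run; build_commands and convert_all are names made up for this example, and sys.executable is used so that the same interpreter runs the unmodified ConvertXMLtoCSV.py.
```python
import subprocess
import sys
from pathlib import Path


def build_commands(xml_dir, csv_dir, script='ConvertXMLtoCSV.py'):
    # Build one argument list per XML file; nothing is executed here.
    commands = []
    for xml_path in sorted(Path(xml_dir).glob('*.xml')):
        csv_path = Path(csv_dir) / (xml_path.stem + '.csv')
        commands.append([sys.executable, script,
                         '-i', str(xml_path), '-o', str(csv_path)])
    return commands


def convert_all(xml_dir, csv_dir):
    # Create the output folder if needed, then run the converter once per file.
    Path(csv_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(xml_dir, csv_dir):
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(command, check=True)


if __name__ == '__main__' and len(sys.argv) == 3:
    convert_all(sys.argv[1], sys.argv[2])
```
+Saved as convert_all.py, it would be run as: python convert_all.py XML_Files CSV_Files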
",python
+"Is there *any* solution to packaging a python app that uses cppyy?
+I'm no novice when creating cross-platform runtimes of my python desktop apps. I create various tools for my undergraduates using mostly pyinstaller, cxfreeze, sometimes fbs, and sometimes briefcase. Anyone who does this on a regular basis knows that there are lots of quirks and adjustments needed to target Linux, windows, and macos when using arbitrary collections of python modules, but I've managed to figure everything out until now.
+I have a python GUI app that uses a c++ library that is huge and ever-changing, so I can't just re-write it in python. I've successfully written python code that uses the c++ library using the amazing (and possibly magical) library called cppyy that allows you to run c++ code from python with hardly any effort. Everything runs great on Linux, mac, and windows, but I cannot get it packaged into runtimes and I've tried all the systems above. All of them have no problem producing the runtimes (i.e., no errors), but they fail when you run them. Essentially they all give some sort of error about not being able to find cppyy-backend (e.g., pyinstaller, and fbs which uses pyinstaller, give this message when you run the binary):
+/home/nogard/Desktop/cppyytest/target/MyApp/cppyy_backend/loader.py:113: UserWarning: No precompiled header available ([Errno 2] No such file or directory: '/home/nogard/Desktop/cppyytest/target/MyApp/cppyy_backend'); this may impact performance.
+Traceback (most recent call last):
+ File "main.py", line 5, in <module>
+ File "<frozen importlib._bootstrap>", line 971, in _find_and_load
+ File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
+ File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
+ File "/home/nogard/Desktop/cppyytest/venv/lib/python3.6/site-packages/PyInstaller/loader/pyimod03_importers.py", line 628, in exec_module
+ exec(bytecode, module.__dict__)
+ File "cppyy/__init__.py", line 74, in <module>
+ File "<frozen importlib._bootstrap>", line 971, in _find_and_load
+ File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
+ File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
+ File "/home/nogard/Desktop/cppyytest/venv/lib/python3.6/site-packages/PyInstaller/loader/pyimod03_importers.py", line 628, in exec_module
+ exec(bytecode, module.__dict__)
+ File "cppyy/_cpython_cppyy.py", line 20, in <module>
+ File "cppyy_backend/loader.py", line 74, in load_cpp_backend
+RuntimeError: could not load cppyy_backend library
+[11195] Failed to execute script main
+
+I'm really stumped. Usually, you install cppyy with pip, which installs cppyy-backend and other packages. I've even used the cppyy docs methods to compile each dependency as well as cppyy, but the result is the same.
+I'll use any build system that works... has anyone had success? I know I could use docker, but I tried this before and many of my students freaked out at docker asking them to change their BIOS settings to support virtualization. So I'd like to use a normal packaging system that produces some sort of runnable binary.
+If you know how to get pyinstaller, cxfreeze, fbs, or briefcase to work with cppyy (e.g, if you know how to deal with the error above), please let me know. However, if you've gotten a cppyy app packaged with some other system, let me know and I'll use that one.
+If you're looking for some code to run, I've been testing out packaging methods using this minimal code:
+import cppyy
+
+print('hello world from python\n')
+
+cppyy.cppexec('''
+#include <string>
+using namespace std;
+string mystring("hello world from c++");
+std::cout << mystring << std::endl;
+''')
+
","EDIT: figured out the pyinstaller hooks; this should all be fully automatic once released
+With the caveat that I have no experience whatsoever with packaging run-times, so I may be missing something obvious, but I've just tried pyinstaller, and the following appears to work.
+First, save your script above as example.py, then create a spec file:
+$ pyi-makespec example.py
+
+Then, add the headers and libraries from cppyy_backend as datas (skipping the python files, which are added by default). The simplest seems to be to pick up all directories from the backend, so change the generated example.spec by adding at the top:
+def backend_files():
+ import cppyy_backend, glob, os
+
+ all_files = glob.glob(os.path.join(
+ os.path.dirname(cppyy_backend.__file__), '*'))
+
+ def datafile(path):
+ return path, os.path.join('cppyy_backend', os.path.basename(path))
+
+ return [datafile(filename) for filename in all_files if os.path.isdir(filename)]
+
+and replace the empty datas in the Analysis object with:
+ datas=backend_files(),
+
+If you also need the API headers from CPyCppyy, then these can be found e.g. like so:
+def api_files():
+ import cppyy, os
+
+ paths = str(cppyy.gbl.gInterpreter.GetIncludePath()).split('-I')
+ for p in paths:
+ if not p: continue
+
+ apipath = os.path.join(p.strip()[1:-1], 'CPyCppyy')
+ if os.path.exists(apipath):
+ return [(apipath, os.path.join('include', 'CPyCppyy'))]
+
+ return []
+
+and added to the Analysis object:
+ datas=backend_files()+api_files(),
+
+Note however, that Python.h then also needs to exist on the system where the package will be deployed. If need be, Python.h can be found through module sysconfig and its path provided through cppyy.add_include_path in the bootstrap.py file discussed below.
+Next, consider the precompiled header (file cppyy_backend/etc/allDict.cxx.pch): this contains the C++ standard headers in LLVM intermediate representation. If added, it pre-empts the need for a system compiler where the package is deployed. However, if there is a system compiler, then ideally, the PCH should be recreated on first use after deployment.
+As is, however, the loader.py script in cppyy_backend uses sys.executable, which is broken by the freezing (meaning, it's the top-level script, not python, leading to an infinite recursion). And even when the PCH is available, its timestamp is compared to the timestamp of the include directory, and it is rebuilt if older. Since both the PCH and the include directory get new timestamps based on copy order, not build order, this is unreliable and may lead to spurious rebuilding. Therefore, either disable the PCH, or disable the timestamp checking.
+To do so, choose one of these two options and write it in a file called bootstrap.py, by uncommenting the desired behavior:
+### option 1: disable the PCH altogether
+
+# import os
+# os.environ['CLING_STANDARD_PCH'] = 'none'
+
+### option 2: force the loader to declare the PCH up-to-date
+
+# import cppyy_backend.loader
+#
+# def _is_uptodate(*args):
+# return True
+#
+# cppyy_backend.loader._is_uptodate = _is_uptodate
+
+then add the bootstrap as a hook to the spec file in the Analysis object:
+ runtime_hooks=['bootstrap.py'],
+
+As discussed above, the bootstrap.py is also a good place to add more include paths as necessary, e.g. for Python.h.
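+A possible shape for that include-path step is sketched below. The helper names are made up for this example; sysconfig.get_paths and cppyy.add_include_path are the only library calls used. Whether the reported directory actually holds Python.h on the deployment machine (and inside a frozen app) is an assumption, so the helper checks first.
```python
import os
import sysconfig


def python_include_dir():
    # Directory where this interpreter expects its C headers (Python.h).
    return sysconfig.get_paths()['include']


def add_python_headers():
    # Hypothetical helper for bootstrap.py; not part of cppyy or PyInstaller.
    include_dir = python_include_dir()
    if not os.path.exists(os.path.join(include_dir, 'Python.h')):
        return False  # headers are not installed at the expected location
    import cppyy  # imported lazily so the check above works without cppyy
    cppyy.add_include_path(include_dir)
    return True
```
+In bootstrap.py, add_python_headers() would be called after the PCH option chosen above.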
+Finally, run as usual:
+$ pyinstaller example.spec
+
",python
+"Why is Pipenv not picking up my Pyenv versions?
+My system Python version is 3.8.5, however I use pyenv to manage an additional version, 3.6.0, to mirror the server version my project is deployed to. I previously used virtualenv + virtualenvwrapper to manage my virtual environments, but I've heard great things about pipenv and thought I would give it a go. It's all great until I try using Python 3.6.0. The flow goes something like this:
+$ mkdir test_project && cd test_project
+$ pyenv shell 3.6.0
+$ pipenv install django
+Creating a virtualenv for this project…
+Pipfile: /home/user/projects/test_project/Pipfile
+Using /home/user/.pyenv/shims/python (3.6.0) to create virtualenv…
+⠸ Creating virtual environment...created virtual environment CPython3.8.5.final.0-64 in 130ms
+ creator CPython3Posix(dest=/home/user/.local/share/virtualenvs/test_project-eAvoynKo-/home/user, clear=False, global=False)
+ seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/user/.local/share/virtualenv)
+ added seed packages: pip==20.2.3, setuptools==50.3.0, wheel==0.35.1
+ activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
+
+✔ Successfully created virtual environment!
+Traceback (most recent call last):
+ File "/home/user/.pyenv/versions/3.6.0/bin/pipenv", line 11, in <module>
+ sys.exit(cli())
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 829, in __call__
+ return self.main(*args, **kwargs)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 782, in main
+ rv = self.invoke(ctx)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 1259, in invoke
+ return _process_result(sub_ctx.command.invoke(sub_ctx))
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 1066, in invoke
+ return ctx.invoke(self.callback, **ctx.params)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 610, in invoke
+ return callback(*args, **kwargs)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 73, in new_func
+ return ctx.invoke(f, obj, *args, **kwargs)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 610, in invoke
+ return callback(*args, **kwargs)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 21, in new_func
+ return f(get_current_context(), *args, **kwargs)
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/cli/command.py", line 252, in install
+ site_packages=state.site_packages
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/core.py", line 1928, in do_install
+ site_packages=site_packages,
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/core.py", line 580, in ensure_project
+ pypi_mirror=pypi_mirror,
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/core.py", line 512, in ensure_virtualenv
+ python=python, site_packages=site_packages, pypi_mirror=pypi_mirror
+ File "/home/user/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pipenv/core.py", line 986, in do_create_virtualenv
+ with open(project_file_name, "w") as f:
+FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.local/share/virtualenvs/test_project-eAvoynKo-/home/user/.pyenv/shims/python/.project'
+
+
+I came across the previous question "Pipenv not recognizing Pyenv version?" and set the environment variable PIPENV_PYTHON="$PYENV_ROOT/shims/python" in my .bashrc file, to no avail.
+Using the system Python version 3.8.5 works flawlessly:
+$ pipenv install django
+Creating a virtualenv for this project…
+Pipfile: /home/user/projects/test_project/Pipfile
+Using /home/user/.pyenv/shims/python (3.8.5) to create virtualenv…
+⠹ Creating virtual environment...created virtual environment CPython3.8.5.final.0-64 in 114ms
+ creator CPython3Posix(dest=/home/user/.local/share/virtualenvs/test_project-eAvoynKo-/home/user/.pyenv/shims/python, clear=False, global=False)
+ seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/user/.local/share/virtualenv)
+ added seed packages: pip==20.2.2, setuptools==50.3.0, wheel==0.35.1
+ activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
+
+✔ Successfully created virtual environment!
+Virtualenv location: /home/user/.local/share/virtualenvs/test_project-eAvoynKo-/home/user/.pyenv/shims/python
+Creating a Pipfile for this project…
+Installing django…
+Adding django to Pipfile's [packages]…
+✔ Installation Succeeded
+Pipfile.lock not found, creating…
+Locking [dev-packages] dependencies…
+Locking [packages] dependencies…
+Building requirements...
+Resolving dependencies...
+✔ Success!
+Updated Pipfile.lock (a6086c)!
+Installing dependencies from Pipfile.lock (a6086c)…
+ ▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉ 0/0 — 00:00:00
+To activate this project's virtualenv, run pipenv shell.
+Alternatively, run a command inside the virtualenv with pipenv run.
+
+Update
+I still can't get it to recognize the Python version activated with pyenv shell x.x.x. However, after removing the PIPENV_PYTHON environment variable and creating a new virtualenv with pipenv install --python 3.6, pipenv does recognize the installed pyenv version.
","pipenv doesn't respect pyenv local and pyenv global (reference).
+It may not respect pyenv shell either.
+I usually do what you did and specify the Python version explicitly, e.g. pipenv install --python 3.7
",python
+"while loop equivalent function/snippet in awkward array
+I have a function that I would like to convert so that I can use it with Awkward Array 1.
+The following function used to work for floats, but it does not work for awkward arrays, for known reasons.
+def Phi_mpi_pi(x):
+ kPI=(3.14159265)
+ kTWOPI = 2 * kPI
+
+ #while ((x.any() >= kPI).any()): x = x - kTWOPI;
+ #while ((x.any() < -kPI).any()): x = x + kTWOPI;
+ while ((x >= kPI)): x = x - kTWOPI;
+ while ((x < -kPI)): x = x + kTWOPI;
+ return x;
+
+I tried to convert it into a numpy/awkward compatible form, and the new function looks like this:
+def Phi_mpi_pi(x):
+ kPI=numpy.array(3.14159265)
+ kPI = kPI.repeat(len(x))
+ kTWOPI = 2 * kPI
+ while ((x >= kPI)): x = x - kTWOPI;
+ while ((x < -kPI)): x = x + kTWOPI;
+ return x;
+
+This function stays stuck in the while loop forever, and I couldn't find a way to debug it.
+The task of the function is to keep the values in an awkward array between -kPI and +kPI, but this logic does not give the desired results.
+e.g.
+x=ak.Array([[0.7999999999999998, 1.0, -1.3], [], [-1.4], [-1.8000000000000003, -6.1000000000000005, -1.6000000000000005], [-4.6]])
+
+However, the comparison on its own, e.g. (x <= -kPI), gives the expected output:
+>>> ak.to_list(x <= -kPI)
+ [[False, False, False], [], [False], [False, True, False], [True]]
+
+but the function as a whole does not.
+The output should lie between -kPI and +kPI, following the logic of the while loops. Is there something straightforward that can be used, or any suggestion?
","Okay, got it. You want to adjust every single scalar value in x (which are all angles) to be between -π and π.
+You can do it like this:
+def Phi_mpi_pi(x):
+ y = numpy.add(x, numpy.pi)
+ y = numpy.mod(y, 2*numpy.pi)
+ y = numpy.subtract(y, numpy.pi)
+ return y
+
+Or, more terse and much less readable:
+def Phi_mpi_pi(x):
+ return numpy.subtract(numpy.mod(numpy.add(x, numpy.pi), 2*numpy.pi), numpy.pi)
+
+What it does is this:
+
+- Add π to all angles (so they point in the opposite direction).
+- Take modulo 2π on all angles so they are all from 0 to 2π (2π not included).
+- Subtract π from all the angles again (so they point in the right direction again). Now they are all from -π to +π (+π not included).
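+
+That this single modulo step does the same job as the two while loops can be checked on plain floats. The following is a minimal sketch using only the standard library; wrap_loop and wrap_mod are names made up here, and the sample values are arbitrary (exact multiples of π are left out to avoid rounding edge cases).
```python
import math


def wrap_loop(x):
    # The original scalar logic: shift by 2*pi until x is in [-pi, pi).
    while x >= math.pi:
        x -= 2 * math.pi
    while x < -math.pi:
        x += 2 * math.pi
    return x


def wrap_mod(x):
    # Add pi, take modulo 2*pi (result in [0, 2*pi)), subtract pi again.
    return (x + math.pi) % (2 * math.pi) - math.pi


samples = [0.8, 1.0, -1.3, -1.8, -6.1, -4.6, 4.0, 10.0]
for value in samples:
    assert math.isclose(wrap_loop(value), wrap_mod(value), abs_tol=1e-12)
```
+The numpy version above applies the same arithmetic element-wise, which is why it needs no loop.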
+
+Test:
+x = ak.Array([[0.3, 3.1, numpy.pi, -numpy.pi, -4 * numpy.pi,
+ 200 * numpy.pi, 2 * numpy.pi, -400 * numpy.pi],
+ [], [-1.4], [-1.8, -6, -1.6], [-4.6]])
+y = Phi_mpi_pi(x)
+print("Type of result:", type(y))
+print("Result =", y)
+
+# Check that the resulting range of each value is correct.
+range_is_correct = (ak.all(y >= -numpy.pi) and ak.all(y < numpy.pi))
+
+# Calculate the factors of the 2π adjustments.
+factors = (x - y) / (2 * numpy.pi)
+print("2π factors=", factors)
+
+# Test that all factors of the 2π adjustments are approximate integers.
+adjustments_are_correct = ak.all(numpy.abs(numpy.mod(factors, 1)) < 1e-15)
+
+# Test that all values are correct, given the test input.
+print("Result is correct:", range_is_correct and adjustments_are_correct)
+
+gives this output:
+Type of result: <class 'awkward1.highlevel.Array'>
+Result = [[0.3, 3.1, -3.14, -3.14, 0, 1.78e-14, 0, 4.62e-14, ... [-1.8, 0.283, -1.6], [1.68]]
+2π factors= [[2.65e-17, 7.07e-17, 1, 0, -2, 100, 1, -200], [], [0], [0, -1, 0], [-1]]
+Result is correct: True
+
+which proves that the operation was correctly performed with the specifically used test data.
+Was that what you wanted?
",python
+"Gradient colour for scatter plot based on age similarity
+I have people from different communities that I would like to plot visually using a scatter plot with different colours based on their distance: if they are close to other points, then they should take a similar colour.
+I know that for a scatter plot I should use the following
+ax1 = df.plot.scatter(x='Year',
+ y='Community')
+
+but I do not know how to set a similar condition.
+For example:
+Community Year
+com1 2006
+com2 2012
+com3 2006
+com4 2013
+com5 1996
+com6 2008
+...
+
+com1 and com3 should have the same colour and com6 a similar one, while com2 and com4 should have colours similar to each other.
+I think it is something about gradient.
+Can you give me some information on how to do it in python?
","Let's just pass Year as color:
+ax1 = df.plot.scatter(x='Year',y='Community', c=df['Year'], cmap='hot')
+
+Output:
+![]()
",python
+"PyInstaller won't import pywin32 / win32clipboard - ImportError upon running executable
+I'm working in Windows 10 with Python 3.8.6 and using PyInstaller 4.0 to compile my script as an executable for distribution. I just added a feature today that required importing win32clipboard. PyInstaller finishes compiling without any errors, but the executable fails to load due to:
+
+ImportError: DLL load failed while importing win32clipboard: The specified module could not be found.
+
+I attempted to compile the program again using the hidden-import flag:
+pyinstaller myscript.py --onefile --hidden-import win32clipboard
+This produced the same result and an ImportError upon trying to load the program (no errors during compiling).
+I know that win32clipboard is part of pywin32 and my program compiled and ran without any issues prior to the code changes that required importing it. It still runs fine out of IDLE and functions as intended when using the win32clipboard-enabled features.
+Is there some way to manually direct PyInstaller to import this correctly, or some other way to fix this issue and get the executable working again?
","I was able to work around this issue by importing pywintypes into my script before win32clipboard.
+import pywintypes
+import win32clipboard
+
+Found the suggestion in an old GitHub bug report for an issue people were having importing win32api with PyInstaller and decided to give it a try. I was able to compile and run my program without any issues after doing this.
",python
+"Readlines causing error after many lines?
+I'm working on an NER task at the moment, with data from wnut17train.conll (https://github.com/leondz/emerging_entities_17). It's basically a collection of tweets where each line is a single word from the tweet with an IOB tag attached (separated by a \t). Different tweets are separated by a blank line (actually, and weirdly enough if you ask me, a '\t\n' line).
+So, for reference, a single tweet would look like this:
+@paulwalk IOBtag
+... ...
+foo IOBtag
+[\t\n]
+@jerrybeam IOBtag
+... ...
+bar IOBtag
+
+The goal for this first step is to achieve a situation where I converted this data set into a training file looking like this:
+train[0] = [(first_word_of_first_tweet, POStag, IOBtag),
+(second_word_of_first_tweet, POStag, IOBtag),
+...,
+(last_word_of_first_tweet, POStag, IOBtag)]
+
+This is what I came up so far:
+tmp = []
+train = []
+nlp = spacy.load("en_core_web_sm")
+with open("wnut17train.conll") as f:
+ for l in f.readlines():
+ if l == '\t\n':
+ train.append(tmp)
+ tmp = []
+ else:
+ doc = nlp(l.split()[0])
+ for token in doc:
+ tmp.append((token.text, token.pos_, token.ent_iob_))
+
+Everything works smoothly for a certain amount of tweets (or lines, not sure yet), but after that I get a
+IndexError: list index out of range
+
+raised by
+doc = nlp(l.split()[0])
+
+First time I got it around line 20'000 (20'533 to be precise), then after checking that this was not due to the file (maybe a different way of separating tweets, or something like this that might have tricked the parser) I removed the first 20'000 lines and tried again. Again, I got an error after around line 20'000 (20'260 - or 40'779 in the original file - to be precise).
+I did some research on readlines() to see if this was a known problem but it looks like it's not. Am I missing something?
","I used the wnut17train.conll file from https://github.com/leondz/emerging_entities_17 and I ran a similar code to generate your required output. I found that in some lines instead of "\t\n" as the blank Line we have only "\n".
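+For such a line l.split() returns an empty list, so l.split()[0] raises IndexError: list index out of range. To handle this, we can treat any whitespace-only line as a tweet separator and append tmp to train in that case too.
+import spacy
+nlp = spacy.load("en_core_web_sm")
+train = []
+tmp = []
+with open("wnut17train.conll") as fp:
+    for l in fp:
+        if not l.strip():  # blank separator: "\t\n" or just "\n"
+            if tmp:
+                train.append(tmp)
+            tmp = []
+        else:
+            # strip the newline so the IOB tag does not end with "\n"
+            word, iob_tag = l.rstrip("\n").split("\t")[:2]
+            for token in nlp(word):
+                tmp.append((word, token.pos_, iob_tag))
+if tmp:  # keep the last tweet if the file does not end with a blank line
+    train.append(tmp)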
+
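+To check the separator handling on its own, without loading spaCy, the grouping step can be tried on a few made-up lines. This is only a sketch: group_tweets is a hypothetical helper, the sample words and tags are invented, and it assumes every non-blank line contains a tab.
```python
def group_tweets(lines):
    # Split word/tag lines into one list per tweet.
    tweets, current = [], []
    for line in lines:
        if not line.strip():  # '\t\n' and '\n' both end a tweet
            if current:
                tweets.append(current)
                current = []
            continue
        word, tag = line.rstrip('\n').split('\t')[:2]
        current.append((word, tag))
    if current:  # last tweet without a trailing blank line
        tweets.append(current)
    return tweets


sample = ['@paulwalk\tO\n', 'foo\tB-person\n', '\t\n',
          '@jerrybeam\tO\n', '\n', 'bar\tO\n']
print(group_tweets(sample))
```
+Both kinds of separator end a tweet here, and the last tweet is kept even though no blank line follows it.
+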
+Hope your question is resolved.
",python
+"Adding numbers in the same indices in list of lists
+Say I have a list a=[[1,2,3,4],[5,6,7,8],[9,10,11,12]]. I want to create a new list, b, with each value in the new list being the sum of all values in that index location of each sub-list.
+So in this case it would be [15,18,21,24] (1+5+9, 2+6+10, 3+7+11, 4+8+12). This is what my code looks like at the moment.
+a=[[1,2,3,4],[5,6,7,8],[9,10,11,12]]
+for i in range(len(a)+1):
+ b.append(sum(b[i] for b in a))
+print(b)
+>>> [15, 18, 21, 24]
+
+I tried to use list comprehension to simplify into:
+b=[sum([c[i]] for c in a) for i in range(len(a)+1)]
+however I get an error TypeError: unsupported operand type(s) for +: 'int' and 'list'
+I've tried googling the problem, but all I can find is people adding lists to integers. In this code it should only be adding integers (a[c[i]]). What have I done wrong?
+EDIT: as Marc Ittel pointed out, there [c[i]] should just be c[i]. However as Yatu pointed out, using map and zip is much simpler.
+Also as everyone has pointed out, it should not be len(a)+1 but rather len(a[0])
+Thanks a lot everyone!
","This can be done quite simply using python's built-ins:
+list(map(sum,zip(*a)))
+#[15, 18, 21, 24]
+
+Your loop produces the expected result, although it is not clear how you defined b. If it starts as an empty list, b=[], this works fine.
+Also, are you sure about for i in range(len(a)+1)? It only works here because len(a)+1 happens to equal the length of the inner lists. Shouldn't you be iterating over as many items as the inner lists have?
+Your list comprehension should be:
+[sum(b[i] for b in a) for i in range(4)]
+# [15, 18, 21, 24]
+
+In your code, you have [c[i]] in the inner level. You're generating a list of lists which you don't need. Just index the list and keep the integer b[i].
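+Putting the two points together (no extra list around the element, and iterating over the width of the inner lists), a small runnable sketch could be:
```python
a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# zip(*a) yields one tuple per index position: (1, 5, 9), (2, 6, 10), ...
column_sums = [sum(column) for column in zip(*a)]

# Index-based equivalent: iterate over the width of the inner lists.
by_index = [sum(row[i] for row in a) for i in range(len(a[0]))]

print(column_sums)  # [15, 18, 21, 24]
print(by_index)     # [15, 18, 21, 24]
```
+Both forms assume that all inner lists have the same length; zip would silently stop at the shortest one.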
",python
+"adding rows to pandas dataframe within loop
+There is a table that I have scraped from a site which I need to convert to a dataframe. Its HTML DOM looks like this:
+<tbody>
+ <tr>
+ <td>value1</td>
+ <td>value2</td>
+ <td> </td>
+ ...
+ <tr>
+ <td>value1</td>
+ <td> </td>
+ <td> </td>
+ ...
+
+I am using beautifulsoup to scrape the page:
+table=soup.find('tbody')
+for row in soup.find_all('tr'):
+ value=row.find('td')
+ print(value.text)
+
+I want to append each value.text as rows of a data frame, including the blank values (as NaN).
+This is a sample output of print(value.text) (the blank lines represent empty values):
+20Q4 FDLR WW Event Webinar 13 FixIssues - Didn't Attend
+205
+204
+0
+0.00%
+1
+0.49%
+1
+0.49%
+179
+87.75%
+65
+31.86%
+3
+1.47%
+3
+1.47%
+3
+
+4.62%
+1
+0.49%
+1
+0
+0.00%
+0
+0.00%
+0
+
+The first row contains the headers of the table.
+How do I go about doing so? Thanks a bunch! :)
","You can simply use the pd.read_html function to convert the html into a dataframe. Here is how you do it:
+import pandas as pd
+
+# Important: pass the entire <table> element to pd.read_html, not just its <tbody>; otherwise it will not work.
+table = soup.find('table')
+
+dfs = pd.read_html(str(table))
+
+# pd.read_html returns a list of DataFrames; with a single table, yours is the first (and only) element.
+df = dfs[0]
+
",python
+"Django - ListView url not connecting to desired view
+I am new to Django and have hit a wall with a certain part of my project and I hope someone can help.
+I have two ListViews in my views.py file which I would like to work similarly to published/draft posts (I'm actually using sanitised and unsanitised reports). Currently, every time I try to access the "Unsanitised" list view (unsanitised_list.html), it just directs me to the sanitised list view (intelreport_list.html).
+views.py:
+class IntelReportListView(ListView):
+ model = IntelReport
+ context_object_name = 'all_logs'
+
+ def get_queryset(self):
+ return IntelReport.objects.filter(create_date__lte=timezone.now()).order_by('-create_date')
+
+
+
+class UnsanitisedListView(LoginRequiredMixin, ListView):
+ login_url = '/login/'
+ redirect_field_name = 'intel_db/unsanitised_list.html'
+
+ model = IntelReport
+
+ def get_queryset(self):
+ return IntelReport.objects.filter(sanitised__isnull=True).order_by('-create_date')
+
+models.py
+class IntelReport(models.Model):
+ gpms_choices = (
+ ***REDACTED***
+ )
+ gpms = models.CharField(max_length=20, blank=True, null=True, choices=gpms_choices)
+
+ officer = models.CharField(max_length=50)
+ create_date = models.DateTimeField(auto_now_add=True)
+ sanitised = models.BooleanField(default=False)
+
+ source_eval_choices = (
+ ***REDACTED****
+ )
+ source_eval = models.CharField(max_length=50, blank=True, null=True, choices=source_eval_choices)
+
+ intel_eval_choices = (
+ ***REDACTED***
+ )
+ intel_eval = models.CharField(max_length=100, blank=True, null=True, choices=intel_eval_choices)
+
+ report = models.TextField(max_length=5000, blank=True, null=True)
+
+ def sanitised_log(self):
+ self.sanitised = True
+ self.save()
+
+ def get_absolute_url(self):
+ return reverse('log_details', kwargs={'pk':self.pk})
+
+ def __str__(self):
+ return str(self.pk)
+
+urls.py
+from django.urls import path
+from intel_db import views
+urlpatterns = [
+ path('welcome/', views.AboutView.as_view(), name='about'),
+ path('logs/', views.IntelReportListView.as_view(), name='log_list'),
+ path('logs/<int:pk>/', views.IntelReportDetailView.as_view(), name='log_detail'),
+ path('logs/new_log/', views.IntelReportCreateView.as_view(), name='new_log'),
+ path('unsanitised/', views.UnsanitisedListView.as_view(), name='unsanitised'),
+ path('logs/<int:pk>/sanitise_log/', views.sanitsed_report, name='sanitised_report'),
+]
+
+and on my landing page (landing.html), this is the link I'm using to try and reach the unsanitised_list.html:
+ **<a href="{% url 'unsanitised' %}">**
+
+
+I cannot figure out why it keeps redirecting me to intelreport_lists.html (the sanitised logs) rather than unsanitised_list.html (the unsanitised logs).
+I hope I'm not just missing something really simple but I've been over it and tried to re-write it innumerable times and can't get it right.
+I hope this is enough information and any help would be greatly appreciated.
+Cheers
","You just have to override template_name when you extend ListView. That is, update your IntelReportListView and UnsanitisedListView like this:
+class IntelReportListView(ListView):
+ model = IntelReport
+ context_object_name = 'all_logs'
+ template_name = 'YOUR_APP_NAME/intelreport_list.html'
+
+ def get_queryset(self):
+ return IntelReport.objects.filter(create_date__lte=timezone.now()).order_by('-create_date')
+
+
+
+class UnsanitisedListView(LoginRequiredMixin, ListView):
+ login_url = '/login/'
+ redirect_field_name = 'intel_db/unsanitised_list.html'
+ template_name = 'YOUR_APP_NAME/unsanitised_list.html'
+ model = IntelReport
+
+ def get_queryset(self):
+ return IntelReport.objects.filter(sanitised__isnull=True).order_by('-create_date')
+
+If you are interested in why it was showing intelreport_list.html rather than unsanitised_list.html: when you extend ListView without setting template_name, it looks for APP_NAME/MODEL_NAME_list.html by default, where MODEL_NAME is the lower-cased name of the model used in the view. Since UnsanitisedListView also uses model = IntelReport, it falls back to intelreport_list.html. Note that redirect_field_name does not select a template; it is the name of the query-string parameter that LoginRequiredMixin uses for the redirect after login.
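+The naming rule can be illustrated in plain Python. This is only a sketch of the rule, not Django's own code: default_list_template is a made-up function, and the app label intel_db is taken from the import in your urls.py.
```python
def default_list_template(app_label, model_name, suffix='_list'):
    # Mirrors the naming rule only; Django derives these values from the model's metadata.
    return '{}/{}{}.html'.format(app_label, model_name.lower(), suffix)


print(default_list_template('intel_db', 'IntelReport'))
```
+Both views share the same model, so without an explicit template_name they resolve to the same default template.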
",python
+"How to find and print a dictionary key/value that matches user input?
+I need to print a dictionary value that matches the input of the user. For example, if the user enters the course number CS101, the output will look like:
+The details for CS101 are:
+Room: 3004
+Instructor: Haynes
+Time: 8:00 a.m.
+
+However, if the user enters an incorrect/invalid course number, I need to print out a message letting them know:
+CS101 is an invalid course number.
+
+I have tried if, for loops, and while loops. The problem is, every time I get the course info printed, the invalid course number message won't display because of KeyError. On the other hand, if I happen to "fix" the error message, then the course number info won't print out and instead will return a NameError / TypeError.
+I will be honest, I have struggled for some time now with this, and I feel as though I am either assigning something incorrectly or printing incorrectly. But I am a beginner and I don't have a great grasp on Python yet, which is why I am asking for help.
+Unfortunately, I am not allowed to create one entire dictionary to group everything in (which would have been easier for me), but instead, I have to create 3 dictionaries.
+This is the code:
+room = {}
+
+room["CS101"] = "3004"
+room["CS102"] = "4501"
+room["CS103"] = "6755"
+room["NT110"] = "1244"
+room["CM241"] = "1411"
+
+instructor = {}
+
+instructor["CS101"] = "Haynes"
+instructor["CS102"] = "Alvarado"
+instructor["CS103"] = "Rich"
+instructor["NT110"] = "Burkes"
+instructor["CM241"] = "Lee"
+
+time = {}
+
+time["CS101"] = "8:00 a.m."
+time["CS102"] = "9:00 a.m."
+time["CS103"] = "10:00 a.m."
+time["NT110"] = "11:00 a.m."
+time["CM241"] = "1:00 p.m."
+
+def info():
+ print(f'College Course Locater Program')
+ print(f'Enter a course number below to get information')
+
+info()
+get_course = input(f'Enter course number here: ')
+print(f'----------------------------------------------')
+
+course_num = get_course
+number = course_num
+name = course_num
+meeting = course_num
+
+if number in room:
+ if name in instructor:
+ if meeting in time:
+ print(f'The details for course {get_course} are: ')
+ print(f'Room: {number["room"]}')
+ print(f'Instructor: {name["instructor"]}')
+ print(f'Time: {meeting["time"]}')
+else:
+ print(f'{course_num} is an invalid course number.')
+
+I have also tried formatting dictionaries in this style:
+time_dict = {
+ "CS101": {
+ "Time": "8:00 a.m."
+ },
+ "CS102": {
+ "Time": "9:00 a.m."
+ },
+ "CS103": {
+ "Time": "10:00 a.m."
+ },
+ "NT110": {
+ "Time": "11:00 a.m."
+ },
+ "CM241": {
+ "Time": "1:00 p.m."
+ },
+}
+
+I thank everyone in advance who has an advice, answer, or suggestions to a solution.
","This code here is unnecessary, because you are essentially setting 4 variables all to the same value get_course:
+course_num = get_course
+number = course_num
+name = course_num
+meeting = course_num
+
+This code doesn't work because number, name and meeting are strings (the course number the user typed), not dictionaries, so indexing them with "room", "instructor" or "time" raises a TypeError:
+print(f'Room: {number["room"]}')
+print(f'Instructor: {name["instructor"]}')
+print(f'Time: {meeting["time"]}')
+
+I replaced the code above with this:
+print(f'Room: {room[get_course]}')
+print(f'Instructor: {instructor[get_course]}')
+print(f'Time: {time[get_course]}')
+
+This searches the dictionary variable room for the key get_course (ex. "CS101") and returns the value corresponding to that key. The same thing happens for the other lines, except with the dictionary instructor and the dictionary time.
+Here is the final code:
+room = {}
+
+room["CS101"] = "3004"
+room["CS102"] = "4501"
+room["CS103"] = "6755"
+room["NT110"] = "1244"
+room["CM241"] = "1411"
+
+instructor = {}
+
+instructor["CS101"] = "Haynes"
+instructor["CS102"] = "Alvarado"
+instructor["CS103"] = "Rich"
+instructor["NT110"] = "Burkes"
+instructor["CM241"] = "Lee"
+
+time = {}
+
+time["CS101"] = "8:00 a.m."
+time["CS102"] = "9:00 a.m."
+time["CS103"] = "10:00 a.m."
+time["NT110"] = "11:00 a.m."
+time["CM241"] = "1:00 p.m."
+
+def info():
+    print(f'College Course Locater Program')
+    print(f'Enter a course number below to get information')
+
+info()
+get_course = input(f'Enter course number here: ')
+print(f'----------------------------------------------')
+
+
+if get_course in room and get_course in instructor and get_course in time:
+    print(f'The details for course {get_course} are: ')
+    print(f'Room: {room[get_course]}')
+    print(f'Instructor: {instructor[get_course]}')
+    print(f'Time: {time[get_course]}')
+else:
+    print(f'{get_course} is an invalid course number.')
+
+Here is a test with the input "CS101":
+College Course Locater Program
+Enter a course number below to get information
+Enter course number here: CS101
+----------------------------------------------
+The details for course CS101 are:
+Room: 3004
+Instructor: Haynes
+Time: 8:00 a.m.
+
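+An alternative is to let the lookup itself signal an unknown course. The following is only a sketch: the dictionaries are abbreviated, and course_details is a hypothetical helper name that does not appear in the question.
+
```python
# Three separate dictionaries, as the assignment requires (abbreviated here).
room = {'CS101': '3004', 'CS102': '4501'}
instructor = {'CS101': 'Haynes', 'CS102': 'Alvarado'}
time = {'CS101': '8:00 a.m.', 'CS102': '9:00 a.m.'}


def course_details(course):
    """Return the report for a course, or an error message if it is unknown."""
    try:
        # A missing key in any of the three dictionaries raises KeyError.
        lines = [
            f'The details for course {course} are:',
            f'Room: {room[course]}',
            f'Instructor: {instructor[course]}',
            f'Time: {time[course]}',
        ]
    except KeyError:
        return f'{course} is an invalid course number.'
    return '\n'.join(lines)


print(course_details('CS101'))
print(course_details('XY999'))
```
+
+Catching KeyError keeps the happy path and the error message in one place, so the membership test does not have to be repeated for each dictionary.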
",python
+"How to calculate NDCG with binary relevances using sklearn?I'm trying to calculate the NDCG score for binary relevances:
+from sklearn.metrics import ndcg_score
+y_true = [0, 1, 0]
+y_pred = [0, 1, 0]
+ndcg_score(y_true, y_pred)
+
+And getting:
+ValueError: Only ('multilabel-indicator', 'continuous-multioutput',
+'multiclass-multioutput') formats are supported. Got binary instead
+
+Is there a way to make this work?
","Please try:
+from sklearn.metrics import ndcg_score
+y_true = [[0, 1, 0]]
+y_pred = [[0, 1, 0]]
+ndcg_score(y_true, y_pred)
+1.0
+
+Note the expected shapes in the docs:
+
+y_true: ndarray, shape (n_samples, n_labels)
+y_score: ndarray, shape (n_samples, n_labels)
+
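+To see why one row per query is expected, it can help to compute the score by hand. The sketch below is a plain-Python illustration of NDCG with a linear gain for a single query. It is not scikit-learn's implementation (in particular, tied scores are simply kept in input order), and the names dcg and ndcg_single are made up for this example.
+
```python
import math


def dcg(relevances):
    # Discounted cumulative gain with a linear gain:
    # rel_i / log2(i + 2) for the item at 0-based rank i.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_single(y_true, y_score):
    """NDCG for one query (one row); ties in y_score keep their input order."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    ranked = [y_true[i] for i in order]
    ideal_dcg = dcg(sorted(y_true, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


print(ndcg_single([0, 1, 0], [0.1, 0.9, 0.2]))  # relevant item ranked first
print(ndcg_single([0, 1, 0], [0.9, 0.2, 0.1]))  # relevant item ranked second
```
+
+Each call scores one ranking of n_labels items, which corresponds to one row of the 2-D arrays that ndcg_score takes.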
",python
+"""init_dgelsd failed init"" When using import numpy as npI am trying to run this simple code
+import numpy as np
+
+my_first_array = np.array([1, 2, 3,4,5])
+
+my_first_array([1, 2, 3, 4, 5])
+
+I believe I am using python 3.9 as i just bought this computer and downloaded the newest version.
+But keep getting the error code:
+Traceback (most recent call last):
+ File "/Users/oledy/Documents/Skole/Dat200/test.py", line 1, in <module>
+ import numpy as np
+ File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/numpy/__init__.py", line 286, in <module>
+ raise RuntimeError(msg)
+RuntimeError: Polyfit sanity test emitted a warning, most likely due to using a buggy Accelerate backend. If you compiled yourself, see site.cfg.example for information. Otherwise report this to the vendor that provided NumPy.
+RankWarning: Polyfit may be poorly conditioned
+
","If you care to read up on the issue here is where I found the solution below.
+rm -v ~/Library/Caches/pip/wheels/*/*/*/*/*numpy* # clear the pip wheel cache of any built numpy wheels
+brew install openblas # make sure OpenBLAS is installed
+# activate your virtualenv
+OPENBLAS="$(brew --prefix openblas)" pip install numpy # let numpy's setup.py know where OpenBLAS is installed
+
",python
+"Transform a Pandas Series into a Dataframe with a for loopThank in advance for anyone's help.
+I'm trying to transform this Pandas series into a Dataframe with the following logic.
+Any time a row from the series starts with "MB" it should create another column in the dataframe, and all the rows below it until the next "MB" should go under that column.
+MB104
+TR15
+TR16
+SP16
+MB301
+TR16
+SP11
+SP16
+SP26
+SP67
+MB302
+TR15
+MB504
+TR15
+SP16
+SP67
+SP109
+MB652
+SP109
+SP110
+
+Into this:
+MB104 MB301 MB302 MB504 MB652
+TR15 TR16 TR15 TR15 SP109
+TR16 SP11 SP16 SP110
+SP16 SP16 SP67
+ SP26 SP109
+ SP67
+
+And this is what I've tried so far
+mbdf = pd.DataFrame()
+assetlist = []
+for row in mbs.itertuples():
+ left2 = row.data[:2]
+ if left2 == 'MB':
+ if headername:
+ mbdf[headername] = pd.Series(assetlist)
+
+ headername = row.data
+ assetlist = []
+else:
+ assetname = row.data
+ assetlist.append(assetname)
+
","It's unclear from your question whether you want them as separate Series or together in the same DataFrame. I assume you want a DataFrame:
+# Read the data
+from collections import defaultdict
+data = defaultdict(list)
+col = None
+
+with open('data.txt') as fp:
+    for line in fp:
+        line = line.strip('\n')
+        if line.startswith('MB'):
+            col = line
+        else:
+            data[col].append(line)
+
+If you want a collection of series:
+series = [pd.Series(value, name=key) for key, value in data.items()]
+
+If you want a DataFrame:
+# Pad every column to the same length
+max_len = max(len(v) for v in data.values())
+
+for key, value in data.items():
+    value += [None for _ in range(max_len - len(value))]
+
+df = pd.DataFrame(data)
+
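+If the values are already in memory rather than in a file, the same grouping can be done on a plain list. This is a sketch under assumptions: the sample list is shortened, and in the question the values would presumably come from something like mbs['data'].tolist().
+
```python
from itertools import zip_longest

# Shortened sample; assumed to come from the Series in the question.
values = ['MB104', 'TR15', 'TR16', 'SP16', 'MB301', 'TR16', 'SP11', 'MB302', 'TR15']

columns = {}
current = None
for value in values:
    if value.startswith('MB'):
        current = value
        columns[current] = []  # create the column even if nothing follows it
    elif current is not None:
        columns[current].append(value)

# Transpose the columns into rows, padding the short ones with None.
header = list(columns)
rows = list(zip_longest(*columns.values(), fillvalue=None))

print(header)
for row in rows:
    print(row)
```
+
+With pandas available, pd.DataFrame(rows, columns=header) should turn the result into the requested frame. zip_longest does the padding that the loop above does by hand.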
",python
+"Find repeated sentences within textI would like to know how I could find similarity within the same sentence.
+I have a list of sentences like these:
+my_list=["do you want pizza for dinner? Do you want pizza for dinner?", "I like pizza", "I have no money I have no money"]
+
+I would like to create a pandas dataframe where, if a sentence is repeated within the same, I assign 1, otherwise 0.
+Something like this:
+Text Repeated?
+do you want pizza for dinner? Do you want pizza for dinner? 1
+I like pizza 0
+I have no money I have no money 1
+
+I was thinking of something like this:
+from collections import Counter
+
+
+my_list = dict(Counter(my_list.split()))
+for i in sorted(my_list.keys()):
+ print ('"'+i+'" is repeated '+str(my_list[i])+' time.')
+
+Then counting how many words there are in total and how many unique words there are in total in that sentence. But I think it would be not good as coding.
+Do you know if there is another way to get the expected result?
","You can use regular expression for the task (regex101):
+import re
+import pandas as pd
+
+my_list=["do you want pizza for dinner? Do you want pizza for dinner?", "I like pizza", "I have no money I have no money"]
+df = pd.DataFrame({'Text': my_list})
+
+r = re.compile(r'(.+)\s*\1$', flags=re.I)
+df['Repeated'] = df['Text'].apply(lambda x: bool(r.match(x))).astype(int)
+print(df)
+
+Prints:
+ Text Repeated
+0 do you want pizza for dinner? Do you want pizz... 1
+1 I like pizza 0
+2 I have no money I have no money 1
+
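+The pattern is easier to follow when it is written out piece by piece. Below is the same expression in verbose form, using only the standard library. The helper name is_repeated is made up for this sketch.
+
```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can be commented.
REPEATED = re.compile(
    r'''
    (.+)    # group 1: some text at the start of the string
    \s*     # optional whitespace between the two copies
    \1      # the same text again (case-insensitive because of re.I)
    $       # and nothing after it
    ''',
    flags=re.I | re.VERBOSE,
)


def is_repeated(text):
    # match() anchors at the start, the pattern's $ anchors at the end.
    return int(bool(REPEATED.match(text)))


for sentence in ['do you want pizza for dinner? Do you want pizza for dinner?',
                 'I like pizza',
                 'I have no money I have no money']:
    print(is_repeated(sentence), sentence)
```
+
+One caveat: the pattern flags any string that consists of two identical halves, so a short word such as 'haha' also counts as repeated.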
",python
+"Getting import error when importing db from app.pyapp.py file
+from flask import Flask
+from flask_sqlalchemy import SQLAlchemy
+from database import CreateQuestions
+
+app = Flask(__name__)
+app.config['SQLALCHEMY_DATABASE_URI'] = "postgres://jim:password@localhost:5433/exquizdb"
+db = SQLAlchemy(app)
+
+
+@app.route('/questions')
+def post_question():
+ q = 'hey'
+ a = 'hi'
+ CreateQuestions.Create_questions(q, a)
+
+
+@app.route('/')
+def hello_world():
+ return 'Hello World!'
+
+
+if __name__ == '__main__':
+ app.run()
+
+database.py
+# database.py
+from app import db
+
+
+class Questions(db.Model):
+ question_id = db.Column(db.Integer, primary_key=True)
+ question = db.Column(db.Text)
+
+
+class Answers(db.Model):
+ answer_id = db.Column(db.Integer, primary_key=True)
+ answer = db.Column(db.Text)
+ question_id = db.Column(db.Integer, db.ForeignKey('questions.question_id'))
+
+
+class CreateQuestions:
+ def create_questions(self, q, a):
+ ques = Questions(question=q)
+ ans = Answers(answer=a)
+ db.create_all()
+ db.session.add(ques)
+ db.session.add(ans)
+ db.session.commit()
+
+When I run the app, it crashes with 'ImportError: cannot import name 'db' from 'app''. I don't know what I am doing wrong. When everything is in the same file it works, but when I move the database models out of app.py it doesn't work anymore. Thank you for your help.
","This is a circular import: app.py imports database before db has been created, and database.py in turn tries to import db from app. Import database after you instantiate your db:
+from flask import Flask
+from flask_sqlalchemy import SQLAlchemy
+
+app = Flask(__name__)
+app.config['SQLALCHEMY_DATABASE_URI'] = "postgres://jim:password@localhost:5433/exquizdb"
+db = SQLAlchemy(app)
+from database import CreateQuestions
+
",python
+"How to retrieve attributes from selected datum in Altair?I have a Streamlit dashboard which lets me interactively explore a t-SNE embedding using an Altair plot. I am trying to figure out how to access the metadata of the selected datum so that I can visualize the corresponding image. In other words, given:
+selector = alt.selection_single()
+chart = (
+ alt.Chart(df)
+ .mark_circle()
+ .encode(x="tSNE_dim1", y="tSNE_dim2", color="predicted class", tooltip=["image url", "predicted class"])
+ .add_selection(selector)
+)
+
+...is there something akin to
+selected_metadata = selector.tooltip
+update_dashboard_img(img=selected_metadata["image url"], caption=selected_metadata["predicted class"])
+
+I am aware about image marks but the images are on S3 and there are too many of them to fit in the plot.
","I hate to disagree with the creator of Altair, but I was able to achieve this using streamlit-vega-lite package. This works by wrapping the call to the chart creation function with altair_component():
+ from streamlit_vega_lite import altair_component
+ ...
+
+ event_dict = altair_component(altair_chart=create_tsne_chart(tsne_df))
+ # note: works with selector = alt.selection_interval(), not selection_single()
+ dim1_bounds, dim2_bounds = event_dict.get("dim1"), event_dict.get("dim2")
+ if dim1_bounds:
+ (dim1_min, dim1_max), (dim2_min, dim2_max) = dim1_bounds, dim2_bounds
+ selected_images = tsne_df[
+ (tsne_df.dim1 >= dim1_min)
+ & (tsne_df.dim1 <= dim1_max)
+ & (tsne_df.dim2 >= dim2_min)
+ & (tsne_df.dim2 <= dim2_max)
+ ]
+ st.write("Selected Images")
+ st.write(selected_images)
+
+ if len(selected_images) > 0:
+ for _index, row in selected_images.iterrows():
+ img = get_img(row["image url"])
+ st.image(img, caption=f"{row['image url']} {row['predicted class']}", use_column_width=True)
+
+The event_dict only contains information about the selector bounds. So, you have to use those values to reselect the data that was selected in the interactive chart.
+Note that this package is a POC and has various limitations. Please upvote the Streamlit feature request created by the author of streamlit_vega_lite.
",python
+"Sublime Text 3: How do I make a keybind to enable and disable Anaconda linting?I want to be able to use a hotkey to enable/disable anaconda linting. Its really inconvenient to have to open the settings whenever I had to use it. I'm new to Sublime Text but from what I see at the Keybindings, you can pass a variable with the args. For example:
+[{"keys": ["ctrl+q"], "command": "toggle_comment", "args": {"block": false}}]
+
+So, I was thinking, maybe there's a command to change package "settings - user" and pass a var to set ["anaconda_linting": false,] into true or false?
","You can do this with a custom plugin and keybinding. Select Tools → Developer → New Plugin… and set the contents of the file that opens to this:
+import sublime
+import sublime_plugin
+
+
+class ToggleAnacondaLintingCommand(sublime_plugin.ApplicationCommand):
+    def run(self):
+        s = sublime.load_settings("Anaconda.sublime-settings")
+        current = s.get("anaconda_linting")
+        new = not current
+        s.set("anaconda_linting", new)
+        sublime.save_settings("Anaconda.sublime-settings")
+        sublime.active_window().run_command('save')
+
+Hit Ctrl+S to save, and your Packages/User folder should open up. Save the file as toggle_anaconda_linting.py.
+Now, open up your keybindings and add the following between the [ ] characters (choosing whatever shortcut you want):
+{"keys": ["ctrl+alt+shift+l"], "command": "toggle_anaconda_linting"},
+
+Now, whenever you hit the shortcut, "anaconda_linting" will be toggled for all files.
",python
+"Getting first (or a specific) td in BeautifulSoup with no classI have one of those nightmare tables with no class given for the tr and td tags.
+A sample page is here: https://system.gotsport.com/org_event/events/1271/schedules?age=19&gender=m
+(You'll see in the code below that I'm getting multiple pages, but that's not the problem.)
+I want the team name (nothing else) from each bracket. The output should be:
+
+OCYS
+FL Rush
+Jacksonville FC
+Atlanta United
+SSA
+Miami Rush Kendall SC
+IMG
+Tampa Bay United
+etc.
+
+I've been able to get every td in the specified tables. But every attempt to use [0] to get the first td of every row gives me an "index out of range" error.
+The code is:
+import requests
+import csv
+from bs4 import BeautifulSoup
+
+batch_size = 2
+urls = ['https://system.gotsport.com/org_event/events/1271/schedules?age=19&gender=m', 'https://system.gotsport.com/org_event/events/1271/schedules?age=17&gender=m']
+
+# iterate through urls
+for url in urls:
+ response = requests.get(url)
+ soup = BeautifulSoup(response.content, "html.parser")
+
+
+
+# iterate through leagues and teams
+ leagues = soup.find_all('table', class_='table table-bordered table-hover table-condensed')
+ for league in leagues:
+ row = ''
+ rows = league.find_all('tr')
+ for row in rows:
+ team = row.find_all('td')
+ teamName = team[0].text.strip()
+ print(teamName)
+
+After a couple of hours of work, I feel like I'm just one syntax change away from getting this right. Yes?
","You can use a CSS Selector nth-of-type(n). It works for both links:
+import requests
+from bs4 import BeautifulSoup
+
+url = "https://system.gotsport.com/org_event/events/1271/schedules?age=19&gender=m"
+soup = BeautifulSoup(requests.get(url).content, "html.parser")
+
+for tag in soup.select(".small-margin-bottom td:nth-of-type(1)"):
+    print(tag.text.strip())
+
+Output:
+OCYS
+FL Rush
+Jacksonville FC
+Atlanta United
+SSA
+...
+...
+Real Salt Lake U19
+Real Colorado
+Empire United Soccer Academy
+
",python
+"Performance difference between multithread using queue and futures.ThreadPoolExecutor using list in python3?I was trying various approaches with python multi-threading to see which one fits my requirements. To give an overview, I have a bunch of items that I need to send to an API. Then based on the response, some of the items will go to a database and all the items will be logged; e.g., for an item if the API returns success, that item will only be logged but when it returns failure, that item will be sent to database for future retry along with logging.
+Now based on the API response I can separate out success items from failure and make a batch query with all failure items, which will improve my database performance. To do that, I am accumulating all requests at one place and trying to perform multithreaded API calls(since this is an IO bound task, I'm not even thinking about multiprocessing) but at the same time I need to keep track of which response belongs to which request.
+Coming to the actual question, I tried two different approaches which I thought would give nearly identical performance, but there turned out to be a huge difference.
+To simulate the API call, I created an API in my localhost with a 500ms sleep(for avg processing time). Please note that I want to start logging and inserting to database after all API calls are complete.
+Approach - 1(With threading.Thread and queue.Queue())
+import requests
+import datetime
+import threading
+import queue
+
+def target(data_q):
+ while not data_q.empty():
+ data_q.get()
+ response = requests.get("https://postman-echo.com/get?foo1=bar1&foo2=bar2")
+ print(response.status_code)
+ data_q.task_done()
+
+if __name__ == "__main__":
+ data_q = queue.Queue()
+ for i in range(0, 20):
+ data_q.put(i)
+
+ start = datetime.datetime.now()
+ num_thread = 5
+ for _ in range(num_thread):
+ worker = threading.Thread(target=target(data_q))
+ worker.start()
+
+ data_q.join()
+
+ print('Time taken multi-threading: '+str(datetime.datetime.now() - start))
+
+I tried with 5, 10, 20 and 30 times and the results are below correspondingly,
+Time taken multi-threading: 0:00:06.625710
+Time taken multi-threading: 0:00:13.326969
+Time taken multi-threading: 0:00:26.435534
+Time taken multi-threading: 0:00:40.737406
+What shocked me here is, I tried the same without multi-threading and got almost same performance.
+Then after some googling around, I was introduced to futures module.
+Approach - 2(Using concurrent.futures)
+def fetch_url(im_url):
+ try:
+ response = requests.get(im_url)
+ return response.status_code
+ except Exception as e:
+ traceback.print_exc()
+
+if __name__ == "__main__":
+ data = []
+ for i in range(0, 20):
+ data.append(i)
+
+ start = datetime.datetime.now()
+ urls = ["https://postman-echo.com/get?foo1=bar1&foo2=bar2" + str(item) for item in data]
+ with futures.ThreadPoolExecutor(max_workers=5) as executor:
+ responses = executor.map(fetch_url, urls)
+ for ret in responses:
+ print(ret)
+ print('Time taken future concurrent: ' + str(datetime.datetime.now() - start))
+
+Again with 5, 10, 20 and 30 times and the results are below correspondingly,
+Time taken future concurrent: 0:00:01.276891
+Time taken future concurrent: 0:00:02.635949
+Time taken future concurrent: 0:00:05.073299
+Time taken future concurrent: 0:00:07.296873
+Now I've heard about asyncio, but I've not used it yet. I've also read that it gives even better performance than futures.ThreadPoolExecutor().
+Final question, If both approaches are using threads(or so I think) then why there is a huge performance gap? Am I doing something terribly wrong? I looked around. Was not able to find a satisfying answer. Any thoughts on this would be highly appreciated. Thanks for going through the question.
+[Edit 1]The whole thing is running on python 3.8.
+[Edit 2] Updated code examples and execution times. Now they should run on anyone's system.
","The documentation of ThreadPoolExecutor explains in detail how many threads are started when the max_workers parameter is not given, as in your example. The behaviour is different depending on the exact Python version, but the number of tasks started is most probably more than 3, the number of threads in the first version using a queue. You should use futures.ThreadPoolExecutor(max_workers= 3) to compare the two approaches.
+To the updated Approach - 1 I suggest to modify the for loop a bit:
+for _ in range(num_thread):
+    target_to_run = target(data_q)
+    print('target to run: {}'.format(target_to_run))
+    worker = threading.Thread(target=target_to_run)
+    worker.start()
+
+The output will be like this:
+200
+...
+200
+200
+target to run: None
+target to run: None
+target to run: None
+target to run: None
+target to run: None
+Time taken multi-threading: 0:00:10.846368
+
+The problem is that the Thread constructor expects a callable object or None as its target. You do not give it a callable, rather queue processing happens on the first invocation of target(data_q) by the main thread, and 5 threads are started that do nothing because their target is None.
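+A corrected version of Approach 1 passes the function and its arguments separately, so that each thread runs the worker itself. The sketch below replaces the HTTP request with a short sleep so that it runs without a network. The results list and the 0.05 s delay are assumptions made for the illustration.
+
```python
import queue
import threading
import time


def worker(data_q, results):
    # Each thread pulls items until the queue is drained.
    while True:
        try:
            item = data_q.get_nowait()
        except queue.Empty:
            return
        time.sleep(0.05)        # stand-in for the blocking API call
        results.append(item)    # list.append is thread-safe in CPython
        data_q.task_done()


data_q = queue.Queue()
for i in range(20):
    data_q.put(i)

results = []
start = time.perf_counter()
threads = [
    # Pass the function itself plus its arguments; do not call it here.
    threading.Thread(target=worker, args=(data_q, results))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(f'{len(results)} items in {elapsed:.2f}s')
```
+
+get_nowait() with queue.Empty is used instead of checking empty() first, because another thread may take the last item between the check and the get().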
",python
+"Better way to write nested try except pythonIs there a better way for me to write this without nesting it like this?
+try:
+ if find_path(graph, startPos, (targetX+1,targetY,targetX,targetY)):
+ print('YES')
+except:
+ try:
+ if find_path(graph, startPos, (targetX,targetY+1,targetX,targetY)):
+ print('YES')
+ except:
+ try:
+ if find_path(graph, startPos, (targetX-1,targetY,targetX,targetY)):
+ print('YES')
+ except:
+ try:
+ if find_path(graph, startPos, (targetX,targetY-1,targetX,targetY)):
+ print('YES')
+ except:
+ print('NO')
+
","You could make a list of potential coordinates, and then try them each in turn, in a for loop, breaking the loop on success:
+success = "NO"
+coords = ((targetX+1,targetY), (targetX,targetY+1), (targetX-1,targetY), (targetX,targetY-1))
+
+for coord in coords:
+    try:
+        x, y = coord
+        find_path(graph, startPos, (x, y, targetX, targetY))
+    except:
+        pass
+    else:
+        # Set success state and break out of the for loop
+        success = "YES"
+        break
+print(success)
+
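+Note that the loop above treats any call that does not raise as a success, whereas the question also checks the return value. A variant that honours both could look like the sketch below. The find_path shown here is only a hypothetical stand-in so that the example runs, and first_success is a made-up helper name.
+
```python
def first_success(attempts):
    """Return True as soon as one attempt returns a truthy value without raising."""
    for attempt in attempts:
        try:
            if attempt():
                return True
        except Exception:
            # This candidate failed; move on to the next one.
            continue
    return False


# Hypothetical stand-in for find_path: raises for unreachable targets.
def find_path(graph, start, target):
    if target[:2] not in graph:
        raise KeyError(target[:2])
    return [start, target[:2]]


graph = {(0, 0), (1, 2)}
targetX, targetY = 1, 1
coords = [(targetX + 1, targetY), (targetX, targetY + 1),
          (targetX - 1, targetY), (targetX, targetY - 1)]
# c=c binds the current coordinate to each lambda.
attempts = [lambda c=c: find_path(graph, (0, 0), (*c, targetX, targetY)) for c in coords]
print('YES' if first_success(attempts) else 'NO')
```
+
+Catching Exception rather than using a bare except lets KeyboardInterrupt and SystemExit pass through.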
",python
+"Join DataFrame by Comparing columnsI have two dataframe:
+df1:
+![]()
+df2:
+![]()
+I want update column 'D' of df1 such that if df2 have a lesser value for column d for same value of 'A' and 'B' column df.d will replace df1.d for that row.
+expected output is:
+![]()
+Could someone help me with the python code for this?
+Thanks.
","Merge the two frames on A and B, then keep the smaller of the two D values:
+import numpy as np
+import pandas as pd
+
+# Left-merge on A and B; df2's D column gets the suffix _r
+df_new = pd.merge(df1, df2, how='left', on=['A', 'B'], suffixes=('', '_r'))
+# Where df2 has a smaller D, take it; rows without a match (NaN in D_r) keep their own D
+df_new['D'] = np.where(df_new['D'] > df_new['D_r'], df_new['D_r'], df_new['D'])
+# Drop the helper column
+df_new.drop(columns='D_r', inplace=True)
+
+
+
+ A B C D
+0 x 2 f 1.0
+1 x 3 2 1.0
+2 y 2 4 3.0
+3 y 5 dfs 2.0
+4 z 1 sds 5.0
+
",python
+"Convert column to datetime format, in a leap yearI'm new to Python and programming in general, so I wasn't able to figure out the following: I have a dataframe named ozon, for which column 1 is the time stamp in mm-dd format. Now I want to change that column to a datetime format using the following code:
+ozon[1] = pd.to_datetime(ozon[1], format='%m-%d')
+
+Now this is giving me the following error: ValueError: day is out of range for month.
+I think it has to do with the fact that it's a leap year, so it doesn't recognize February 29 as a valid date. How can I overcome this error? And could I also add a year to the timestamp (2020)?
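+Thanks so much in advance!
","Without a year in the format, the parser assumes 1900, which is not a leap year, so 02-29 is rejected. Add the year to the column values and to the format (assuming the column holds strings):
+ozon[1] = pd.to_datetime(ozon[1] + '-2020', format='%m-%d-%Y')
+
+If it still fails because some values are not valid dates, add the errors='coerce' parameter, which turns them into NaT:
+ozon[1] = pd.to_datetime(ozon[1] + '-2020', format='%m-%d-%Y', errors='coerce')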
+
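+The cause can be reproduced with the standard library alone, since datetime.strptime also falls back to the year 1900 when the format has no year. A small sketch follows. parse_month_day is a hypothetical helper name, and 2020 is taken from the question.
+
```python
from datetime import datetime


def parse_month_day(text, year=2020):
    # Append an explicit year so that 02-29 is validated against that year.
    return datetime.strptime(f'{text}-{year}', '%m-%d-%Y')


try:
    datetime.strptime('02-29', '%m-%d')   # no year given: 1900 is assumed
except ValueError as exc:
    print('without a year:', exc)

print(parse_month_day('02-29'))
```
+
+Passing a non-leap year to the helper makes 02-29 fail again, which is why the year added to the column has to be a leap year whenever the data contains February 29.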
",python
+"Update dictionary key(s) by drop starts with value from key in PythonI have a dictionary dict:
+dict = {'drop_key1': '10001', 'drop_key2':'10002'}
+
+The key(s) in dict startswith drop_, i would like to update dict by dropping drop_ value from key(s):
+dict = {'key1': '10001', 'key2':'10002'}
+
+What is the best approach to do it?
","something like
+d1 = {'drop_key1': '10001', 'drop_key2':'10002'}
+d2 = {k[5:]:v for k,v in d1.items()}
+print(d2)
+
+output
+{'key1': '10001', 'key2': '10002'}
+
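+Slicing with k[5:] removes the first five characters of every key, whether or not it starts with drop_. If some keys might lack the prefix, a guarded version is safer. This is a sketch: strip_prefix and the extra 'other' key are made up for the example.
+
```python
def strip_prefix(mapping, prefix):
    """Return a new dict with prefix removed from the keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in mapping.items()
    }


d1 = {'drop_key1': '10001', 'drop_key2': '10002', 'other': '10003'}
print(strip_prefix(d1, 'drop_'))
```
+
+On Python 3.9 and later, key.removeprefix(prefix) does the same as the conditional expression.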
",python
+"Summing up the output of multiple functions in pythonI currently have three sine functions (y1, y2, y3) and would like to sum the output of the functions in a new function (ytotal) but only where the output of the sine functions are greater than 0.
+import numpy as np
+import matplotlib.pyplot as plt
+#%%
+
+phi = np.linspace(-2*np.pi, 2*np.pi, 100)
+
+y1 = 0.2*np.sin(phi)
+y2 = 0.2*np.sin(phi-(120*(np.pi/180)))
+y3 = 0.2*np.sin(phi-(240*(np.pi/180)))
+
+#if y1 or y2 or y3 > 0:
+# ytotal = y1+y2+y3
+
+
+
+
+
+plt.plot(phi,y1, label = "Piston 1")
+plt.plot(phi,y2, label = "Piston 2")
+plt.plot(phi,y3, label = "Piston 3")
+#plt.plot(phi,ytotal, label = "Total output")
+positions = (0,np.pi/3,2*np.pi/3,np.pi,4*np.pi/3,5*np.pi/3,2*np.pi)
+labels = ("0","60","120","180","240","300","360")
+plt.xticks(positions, labels)
+plt.xlabel('Angular displacement')
+plt.ylabel('Stroke')
+plt.legend()
+plt.show()
+
+![]()
+The output should be something like the following:
+![]()
","Do you mean:
+plt.plot(phi, y1.clip(0)+y2.clip(0)+y3.clip(0), label='Total')
+
+Output:
+![]()
",python
+"Not getting data from database in pythonI have tried getting the registered data from a database the program has created when first run.
+database.py
+import sqlite3
+
+CREATE_TABLE = """CREATE TABLE IF NOT EXISTS college (
+ id INTEGER PRIMARY KEY,
+ FirstName TEXT,
+ SecondName TEXT,
+ Position TEXT);"""
+
+INSERT_DATA = "INSERT INTO college (FirstName, SecondName, Position) VALUES (?, ?, ?);"
+DISPLAY_DATA = "SELECT * FROM college WHERE Position = ?;"
+def connect():
+ return sqlite3.connect('databas.db')
+
+def create_table(connection):
+ with connection:
+ connection.execute(CREATE_TABLE)
+
+def insert_data(connection, FirstName, SecondName, Position):
+ with connection:
+ connection.execute(INSERT_DATA, (FirstName, SecondName, Position))
+
+def request_data(connection, position):
+ with connection:
+ return connection.execute(DISPLAY_DATA, (position,)).fetchall()
+
+api.py
+import database
+
+API_MENU = """
+--- Student Database ---
+
+1) Register new student
+2) Delete
+3) Show students
+
+4) Register new staff
+5) Delete
+6) Show staff
+
+7) >> Quit the program
+
+Your choice: """
+def main():
+ connection = database.connect()
+ database.create_table(connection)
+
+
+ while (choice := input(API_MENU)) != "7":
+ if choice == "1":
+ FirstName = "Julius"
+ SecondName = "Jessie"
+ Position = "Student"
+
+ database.insert_data(connection, FirstName, SecondName, Position)
+ elif choice == "2":
+ database.request_data(connection, "Student")
+ elif choice == "3":
+ pass
+ elif choice == "4":
+ pass
+ elif choice == "5":
+ pass
+ elif choice == "6":
+ pass
+ else:
+ print("Invalid input, please try again!")
+
+main()
+
+I am simply not being returned any data after I have registered the data using option 1. When running the code and choosing option 1, you should input a name, surname and position (Student / Staff) and then write that to the database (i have just defined strings for the purpose of putting the code here) and then choosing the option 2 should return the data that has "Student" as a position however it does not return any data.
","The problem was that I was simply negligent with actually using the function print() to show the data.
+database.request_data(connection, "Student")
+
+The line above should be:
+print(database.request_data(connection, "Student"))
+
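+For a more readable listing than the raw list of tuples, the rows can be unpacked and printed one by one. The sketch below uses an in-memory database so that it is self-contained. The table layout is copied from the question, and the output format is just one possibility.
+
```python
import sqlite3

connection = sqlite3.connect(':memory:')  # in-memory database for the demo
with connection:
    connection.execute(
        'CREATE TABLE college ('
        'id INTEGER PRIMARY KEY, FirstName TEXT, SecondName TEXT, Position TEXT)'
    )
    connection.execute(
        'INSERT INTO college (FirstName, SecondName, Position) VALUES (?, ?, ?)',
        ('Julius', 'Jessie', 'Student'),
    )

rows = connection.execute(
    'SELECT * FROM college WHERE Position = ?', ('Student',)
).fetchall()

# fetchall() only returns the rows; they still have to be printed.
for student_id, first, second, position in rows:
    print(f'{student_id}: {first} {second} ({position})')
```
+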
",python
+"Count rows with multiple criteria in pandasI have a pandas dataframe with "user_ID", "datetime" and "action_type" columns like it is shown below and I want to get the last column (the last column = desired output) by performing some calculations:
+data = {'user_id': list('ddabdacddaaa'),
+ 'datetime':pd.date_range("20201001", periods=12, freq='H'),
+ 'action_type':list('XXXWZWKOOXWX'),
+ 'as_if_X_calculated':list('121021022223')
+ }
+df = pd.DataFrame(data)
+df
+
+ user_id datetime action_type as_if_X_calculated
+0 d 2020-10-01 00:00:00 X 1
+1 d 2020-10-01 01:00:00 X 2
+2 a 2020-10-01 02:00:00 X 1
+3 b 2020-10-01 03:00:00 W 0
+4 d 2020-10-01 04:00:00 Z 2
+5 a 2020-10-01 05:00:00 W 1
+6 c 2020-10-01 06:00:00 K 0
+7 d 2020-10-01 07:00:00 O 2
+8 d 2020-10-01 08:00:00 O 2
+9 a 2020-10-01 09:00:00 X 2
+10 a 2020-10-01 10:00:00 W 2
+11 a 2020-10-01 11:00:00 X 3
+
+So the last column shows how many times the user has performed an action X at the time of the current record. If we see a user "a", his results will be like 1-1-2-2-3 in chronological order. So how can I calculate the number of action X for the given user that happened at the time of the record or earlier?
+P.S. In Excel it would look like =countifs(A:A; A2; B:B; "<="&B2; C:C; "X") (Column A = "user_id")
","If you sort your dataframe by datetime, you can create a temporary boolean column for the condition on action_type and take an expanding (cumulative) sum of it per user:
+df.sort_values('datetime', inplace=True)
+df['dummy'] = df.action_type == 'X'
+df['X_calculated'] = (df.groupby('user_id')['dummy']
+                      .expanding().sum()
+                      .reset_index(level=0, drop=True)
+                      .astype('int'))
+df.sort_index(inplace=True)
+print(df.drop(columns='dummy'))
+assert df.as_if_X_calculated.astype('int').equals(df.X_calculated), 'X_calculated is not equal'
+
+Out:
+ user_id datetime action_type as_if_X_calculated X_calculated
+0 d 2020-10-01 00:00:00 X 1 1
+1 d 2020-10-01 01:00:00 X 2 2
+2 a 2020-10-01 02:00:00 X 1 1
+3 b 2020-10-01 03:00:00 W 0 0
+4 d 2020-10-01 04:00:00 Z 2 2
+5 a 2020-10-01 05:00:00 W 1 1
+6 c 2020-10-01 06:00:00 K 0 0
+7 d 2020-10-01 07:00:00 O 2 2
+8 d 2020-10-01 08:00:00 O 2 2
+9 a 2020-10-01 09:00:00 X 2 2
+10 a 2020-10-01 10:00:00 W 2 2
+11 a 2020-10-01 11:00:00 X 3 3
+
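+What the grouped expanding sum computes is a running count per user. The same logic in plain Python, as a sketch that reuses the sample data from the question and assumes the records are already in chronological order:
+
```python
from collections import Counter

user_ids = list('ddabdacddaaa')
actions = list('XXXWZWKOOXWX')   # already in chronological order

seen_x = Counter()   # running number of X actions per user
running = []
for user, action in zip(user_ids, actions):
    if action == 'X':
        seen_x[user] += 1
    # Reading a missing key from a Counter returns 0 without inserting it.
    running.append(seen_x[user])

print(''.join(map(str, running)))
```
+
+The printed digits can be compared against the as_if_X_calculated column in the question.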
",python
+"How to compare date in python?How to check datetime in python3?
+Both date and time are same but in 'b' there is no zero.
+ a = '2020-09-08 05:09:02'
+ b = '2020-9-8 5:9:2'
+ if a == b:
+ print("yes")
+ else:
+ print("no")
+
+ Expected Output:
+ yes
+
","You can turn them into datetime objects and compare those instead of the strings:
+import datetime as dt
+a = '2020-09-08 05:09:02'
+b = '2020-9-8 5:9:2'
+a_dt = dt.datetime.strptime(a, '%Y-%m-%d %H:%M:%S')
+b_dt = dt.datetime.strptime(b, '%Y-%m-%d %H:%M:%S')
+if a_dt == b_dt:
+    print("yes")
+else:
+    print("no")
+
",python
+"Python Pandas Dataframe Groupby Sum questionI'm new in Python and I need to combine 2 dataframe with 'id' as the primary key. I need to sum up all the Charges from df1 and df2.
+df1:
+[df1][1]
+
+id Name Charge
+1 A 100
+1 A 100
+2 B 200
+2 B 200
+5 C 300
+6 D 400
+
+df2:
+[df2][2]
+
+id Name Charge
+1 A 100
+1 A 100
+2 B 200
+8 X 200
+
+output:
+[output][3]
+
+id Name Charge(TOTAL from df1 & df2)
+1 A 400
+2 B 600
+5 C 300
+6 D 400
+8 X 200
+
","Try:
+pd.concat([df1, df2]).groupby(['id', 'Name'], as_index=False)['Charge'].sum()
+
+Output:
+ id Name Charge
+0 1 A 400
+1 2 B 600
+2 5 C 300
+3 6 D 400
+4 8 X 200
+
",python
+"Python Split first ""middle"" and last wordI would like to split a list in 3 parts. It works except when the middle part is in "two parts".
+file = open("/topladder/pr_top_fr","r")
+for line in file:
+ fields = line.split( )
+
+ pos1 = fields[0]
+ pos2 = fields[1]
+ pos3 = fields[2]
+
+ print("Position: " + pos1 + " - " + pos2 + " - Tr:" + pos3)
+
+the file look like:
+308 Mars 6249
+948 Ben Stark 6063
+955 Toto 6061
+
+And here is the result:
+Position: 308 - Mars - Tr:6249
+Position: 948 - Ben - Tr:Stark
+Position: 955 - Toto - Tr:6061
+
+Is it possible to "combine" everything that is in the middle in "pos2"?
+Thanks !!
","file = open("/topladder/pr_top_fr","r")
+for line in file:
+    fields = line.split()
+
+    pos1 = fields[0]
+    pos2 = fields[1:-1]  # every field between the first and the last
+    pos3 = fields[-1]
+
+    print("Position: " + pos1 + " - " + ' '.join(pos2) + " - Tr:" + pos3)
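The same idea can be sketched as a small helper that works on plain strings, so it can be tried without the file. The name split_line is only an illustration and the sample lines are the ones from the question.

```python
def split_line(line):
    # Split on whitespace, keep the first and last fields as they are,
    # and join whatever sits in between back into one string.
    first, *middle, last = line.split()
    return first, ' '.join(middle), last

for sample in ['308 Mars 6249', '948 Ben Stark 6063', '955 Toto 6061']:
    pos1, pos2, pos3 = split_line(sample)
    print('Position: ' + pos1 + ' - ' + pos2 + ' - Tr:' + pos3)
```

A line with fewer than two fields raises ValueError in this sketch, so blank lines would need to be skipped first.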
+
",python
+"Django can't search ""method not allowed""im new to django and im currently doing a website for my friend. he wants me to make a system where the users can search the database and the website gives the relevent items according to their serial number.
+i followed a tutorial from the following site: https://learndjango.com/tutorials/django-search-tutorial to figure out how to do db searchs which helped a lot, but im still having a problem: my search bar works, and the result page also works but it only works when i manually type the query on the searchbar myself (e.x. results/?q=number1). However when i search using the input bar and the submit button it sends me to /results/ page and the page gives this:
+This page isn’t working
+If the problem continues, contact the site owner.
+HTTP ERROR 405
+-when i open up pycharm to see the error in terminal it says:
+Method Not Allowed (POST): /result/
+
+Method Not Allowed: /result/
+
+[27/Oct/2020 20:06:02] "POST /result/ HTTP/1.1" 405 0
+
+here are my codes(python3.7,pycharm) websites/urls:
+from . import views
+from django.urls import path
+from django.contrib.auth import views as auth_views
+
+urlpatterns = [
+ path('register/',views.UserFormView.as_view(), name='register'),
+ path('login/', auth_views.LoginView.as_view(), name='login'),
+ path('', views.IndexViews.as_view(), name='index'),
+ path('scan/', views.ScanView.as_view(), name='scan'),
+ path('result/', views.SearchResultsView.as_view(), name='result'),
+]
+
+websites/views:
+class IndexViews(generic.ListView):
+ template_name = "websites/index.html"
+ context_object_name = "object_list"
+
+ def get_queryset(self):
+ return Website.objects.all()
+
+
+class ScanView(TemplateView):
+ form_class = SerialFrom
+ template_name = 'websites/scan.html'
+
+
+class SearchResultsView(ListView):
+ model = SerialNumber
+ template_name = 'websites/result.html'
+
+ def get_queryset(self): # new
+ query = self.request.GET.get('q')
+ context = self.get_context_data(object=self.object)
+ object_list = SerialNumber.objects.filter(
+ Q(number__iexact=query)
+ )
+ return object_list
+
+scan.html:
+ {% extends 'websites/base.html' %}
+{% block albums_active %}active{% endblock %}
+
+{% block body %}
+<head>
+ <meta charset="UTF-8">
+ <title>Scan</title>
+ <link rel="stylesheet" href="style.css">
+</head>
+<body>
+ <form class="box" action="{% url 'result' %}" method="POST">
+ <h1>Product Check</h1>
+ <p> Please enter the serial id of your product to check it.</p>
+ {% csrf_token %}
+ <input type="text" name="q" placeholder="Serial Number">
+ <input type="submit" name="q" placeholder="Check">
+ </form>
+</body>
+{% endblock %}
+
+thank you for taking your time and reading, please help me i really need to do this.
","A ListView [Django-doc] by default does not implement a handler for a POST request. Searching is normally done through a GET request, so you should use:
+<form class="box" action="{% url 'result' %}" method="GET">
+ <h1>Product Check</h1>
+ <p> Please enter the serial id of your product to check it.</p>
+ <input type="text" name="q" placeholder="Serial Number">
+ <input type="submit" placeholder="Check">
+</form>
+Furthermore the <input type="submit"> should not have a name="q" attribute.
+As @Melvyn says, you can also alter the type to type="search" [mozilla] for the text box:
+<form class="box" action="{% url 'result' %}" method="GET">
+ <h1>Product Check</h1>
+ <p> Please enter the serial id of your product to check it.</p>
+ <input type="search" name="q" placeholder="Serial Number">
+ <input type="submit" placeholder="Check">
+</form>
",python
+"Creating a sphere at center of array without a for loop with meshgrid creates shell artifactI would like to implement the code below avoiding for loops to increase speed. Is there any way to do this so that I can create a sphere centered in the numpy array?
+def Create_Sphere(square_numpy_array, pixel_min, pixel_max, HU, Radius_sq ):
+
+
+new_array = np.empty_like(square_numpy_array)
+
+for k in range(pixel_min, pixel_max, 1):
+ for i in range(pixel_min, pixel_max, 1):
+ for j in range(pixel_min, pixel_max, 1):
+ r_sq = (i - 255)**2 + (j - 255)**2 + (k - 255)**2
+ if r_sq <= Radius_sq:
+ new_array[k, i, j] = HU + 1000
+return new_array
+
+Adopting the solution from the recommended link Python vectorizing nested for loops I was able to replace the code. I am getting unexplained artifacts in the final plot however. There are rings appearing around the central sphere. What could be causing these?
+def Create_Sphere_CT(HU=12):
+
+ radius = np.uint16(100) #mm
+ Radius_sq_pixels = np.uint16((radius *2)**2 )
+ sphere_pixel_HU = np.uint16(HU + 1000) #dtype controlled for memory
+ center_pixel = np.uint16(400/2-1)
+ new_array = np.zeros((400,400,400), dtype = np.uint16)
+
+ m,n,r = new_array.shape
+ x = np.arange(0, m, 1, dtype = np.uint16)
+ y = np.arange(0, n, 1, dtype = np.uint16)
+ z = np.arange(0, r, 1, dtype = np.uint16)
+
+ xx,yy,zz = np.meshgrid(x,y,z, indexing = 'ij')
+
+ X = (xx - center_pixel)
+ xx = None #free memory once variable is used
+ Y = (yy - center_pixel)
+ yy= None #free memory once variable is used
+ Z = (zz - center_pixel)
+ zz = None#free memory once variable is used
+
+ mask = (X**2 + Y**2 + Z**2) < Radius_sq_pixels #create sphere mask
+ new_array = sphere_pixel_HU * mask #assign values
+
+ return new_array
+
+This code give a sphere centered with some ring artifacts around
","I realized that using unsigned int was causing errors in subtraction. The final working solution is below
+import numpy as np
+import matplotlib.pyplot as plt
+
+def Sphere(HU):
+    num_pix = int(400)
+    radius = 100
+    Radius_sq_pixels = (radius)**2
+    sphere_pixel_HU = HU
+    center_pixel = int(num_pix/2-1)
+    new_array = np.zeros((num_pix, num_pix, num_pix))
+
+    m,n,r = new_array.shape
+    # default signed integer dtype, so offsets and squares do not wrap around
+    x = np.arange(0, m, 1)
+    y = np.arange(0, n, 1)
+    z = np.arange(0, r, 1)
+
+    xx,yy,zz = np.meshgrid(x,y,z,indexing = 'ij',sparse = True)
+    X = (xx - center_pixel)
+    Y = (yy - center_pixel)
+    Z = (zz - center_pixel)
+
+    mask = ((X**2) + (Y**2) + (Z**2)) < Radius_sq_pixels #create sphere mask
+
+    new_array = sphere_pixel_HU * mask #assign values
+    new_array = new_array.astype(np.uint16) #change datatype
+
+    plt.imshow(new_array[int(num_pix/2)])
+    plt.show()
+
+    return new_array
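To see where the rings can come from, here is a rough sketch that emulates 16-bit unsigned arithmetic in plain Python. The & 0xFFFF mask is my stand-in for np.uint16 (it is not NumPy itself), and sq_dist_uint16 is a made-up helper name. The threshold 200**2 is the one the question's code computes.

```python
MASK = 0xFFFF  # a uint16 keeps only the low 16 bits of every result

def sq_dist_uint16(dx, dy, dz):
    # Emulate X**2 + Y**2 + Z**2 evaluated entirely in 16-bit unsigned integers.
    total = 0
    for d in (dx, dy, dz):
        d16 = d & MASK                        # a negative offset wraps to a large value
        total = (total + (d16 * d16 & MASK)) & MASK
    return total

radius_sq = 200 ** 2                          # 40000, as in the question

true_sq = 3 * 150 ** 2                        # 67500: this voxel is outside the sphere
wrapped_sq = sq_dist_uint16(150, -150, 150)   # the same voxel in emulated uint16

print(true_sq < radius_sq)                    # False
print(wrapped_sq, wrapped_sq < radius_sq)     # 1964 True
```

In this emulation the sum of squares overflows 65535 and wraps to a small number, so a voxel far from the center passes the mask test. That is consistent with shells appearing around the sphere, and with the fix of doing the arithmetic in a wider signed type and converting to uint16 only at the end.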
+
",python
+"Fill DataFrame NaN with another DataFrame with groupbyI am sure this has been answered before but I cannot seem to find the right solution. I have tried pd.merge, merge, combine_first and update and they all don't seem to get the right job. They either create a new variable with an _x or they imply stack in below. I am wishing to merge df1 into df where column c is missing values. I wish to do this for each id on each date
+Example df for task
+df
+ date id a b c d
+1/1/2000 1 10 20 10 11
+1/1/2000 2 11 21 NaN 11
+1/1/2000 3 15 20 NaN 11
+1/1/2000 4 12 24 13 11
+1/2/2000 1 10 25 10 11
+1/2/2000 2 10 20 NaN 15
+1/2/2000 3 10 26 NaN 11
+1/2/2000 4 10 20 16 13
+1/3/2000 1 10 20 10 11
+1/3/2000 2 10 20 NaN 11
+1/3/2000 3 10 20 NaN 11
+1/3/2000 4 10 20 10 11
+
+df1
+ date id c
+12/29/1999 2 1
+12/30/1999 3 1
+12/30/1999 2 1
+12/31/1999 3 1
+12/31/1999 2 1
+12/31/1999 4 1
+1/1/2000 2 1
+1/1/2000 3 14
+1/2/2000 2 13
+1/2/2000 3 22
+1/3/2000 2 13
+1/3/2000 3 18
+
+desired df after combining df and d1
+df
+ date id a b c d
+1/1/2000 1 10 20 10 11
+1/1/2000 2 11 21 1 11
+1/1/2000 3 15 20 14 11
+1/1/2000 4 12 24 13 11
+1/2/2000 1 10 25 10 11
+1/2/2000 2 10 20 13 15
+1/2/2000 3 10 26 22 11
+1/2/2000 4 10 20 16 13
+1/3/2000 1 10 20 10 11
+1/3/2000 2 10 20 13 11
+1/3/2000 3 10 20 18 11
+1/3/2000 4 10 20 10 11
+
","Let's create a MultiIndex on the date and id columns in both dataframes, then use Series.fillna to fill the NaN values in column c of df1 from the corresponding values in df2 (here df1 and df2 stand for the question's df and df1):
+df1['c'] = df1.set_index(['date', 'id'])['c']\
+    .fillna(df2.set_index(['date', 'id'])['c']).tolist()
+
+
+ date id a b c d
+0 1/1/2000 1 10 20 10.0 11
+1 1/1/2000 2 11 21 1.0 11
+2 1/1/2000 3 15 20 14.0 11
+3 1/1/2000 4 12 24 13.0 11
+4 1/2/2000 1 10 25 10.0 11
+5 1/2/2000 2 10 20 13.0 15
+6 1/2/2000 3 10 26 22.0 11
+7 1/2/2000 4 10 20 16.0 13
+8 1/3/2000 1 10 20 10.0 11
+9 1/3/2000 2 10 20 13.0 11
+10 1/3/2000 3 10 20 18.0 11
+11 1/3/2000 4 10 20 10.0 11
+
",python
+"How Do I Download the Latest Version of PyQt5 Tools?I want to download PyQt5 tools. I wrote the code from the pypi site, but it requests a version from me. He didn't accept what I wrote. I want to download the latest version, how can I do?
+pip install pyqt5-tools
+
+ERROR: Could not find a version that satisfies the requirement pyqt5-tools (from versions: none)
+
","Specify the version, for example for version 5.11 write the following command
+pip install pyqt5-tools~=5.11
",python
+"How to get host name from website using pythonI would like to look get the host name using requests repository in python. I tried to do this like that:
+pprint(requests.get("https://www.facebook.com/").headers['domain'])
+
+but it doesn't work. If it is possible in other repository I would be grateful for any answer.
","Based on Regular Expression - Extract subdomain & domain
+import requests
+import re
+
+p = re.compile(r"^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/?\n]+)")
+
+r = requests.get("https://www.google.com/")
+domain = p.match(r.url).group(1)
+print(domain)
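As an alternative sketch that avoids the regular expression, the standard library can parse the URL directly. host_of is just an illustrative name. With requests you would pass r.url to it.

```python
from urllib.parse import urlsplit

def host_of(url):
    # urlsplit parses the URL; .hostname is the lower-cased host
    # without the port or any user info.
    return urlsplit(url).hostname

print(host_of('https://www.facebook.com/'))         # www.facebook.com
print(host_of('https://user@Example.com:8443/x'))   # example.com
```

Unlike the regex above, this keeps a leading www. in the result, so strip it yourself if you only want the bare domain.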
+
",python
+"Desktop Goose like programi have been trying to create a python program similar to desktop goose in the aspect of a widget moving freely around the screen.
+First i tried to create a window with tkinter and make a transparent window
+with a PNG picture of a character and then moving the window with win32gui or any other library that would allow me to do this.
+But first, the tkinter transparent window thing doesn't work because the widgets inherit the transparency, so there is no way i can display the PNG.
+Then i had trouble finding any win32gui function that would allow me to move the windows, i just found stuff that would let me resize them.
+Is there any way i can do any of these two tasks?
","You can create a transparent window using a transparent PNG image as below:
+import tkinter as tk
+
+# select a color as the transparent color
+TRANS_COLOR = '#abcdef'
+
+root = tk.Tk()
+root.overrideredirect(1)
+root.attributes('-transparentcolor', TRANS_COLOR)
+
+image = tk.PhotoImage(file='/path/to/image.png')
+tk.Label(root, image=image, bg=TRANS_COLOR).pack()
+
+# support dragging window
+
+def start_drag(event):
+    global dx, dy
+    dx, dy = event.x, event.y
+
+def drag_window(event):
+    root.geometry(f'+{event.x_root-dx}+{event.y_root-dy}')
+
+root.bind('<Button-1>', start_drag)
+root.bind('<B1-Motion>', drag_window)
+
+root.mainloop()
+
+Then you can use root.geometry(...) to move the root window.
",python
+"Matplotlib.pylot says 'requirement already satisfied' but also 'Module not found'This is a similar question to what I've asked before, but I don't understand what is happening. I am using pip 20.2.3 and python 3.8.2 on Windows.
+Before, when I typed pip install matplotlib or pip3 install matplotlib into cmd, I would get a message saying that all the "requirements are satisfied". But when I run import matplotlib.pylot as plt on VSCode, it tells me
+Traceback (most recent call last):
+ File "c:/Users/sound/Desktop/now/Trade Simulation/#2/Trade Simulation.py", line 6, in <module>
+ import matplotlib.pylot as plt
+ModuleNotFoundError: No module named 'matplotlib.pylot'
+
+Not only that, I recently just updated pip from 20.2.3 to 20.2.4. When I run the same command pip install matplotlib, instead of saying "requirements are satisfied" like before, I get a massive error message in red that looks like:
+ ERROR: Command errored out with exit status 1:
+ command: 'c:\users\sound\appdata\local\programs\python\python39\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\sound\\AppData\\Local\\Temp\\pip-install-9w_lvhnf\\matplotlib\\setup.py'"'"'; __file__='"'"'C:\\Users\\sound\\AppData\\Local\\Temp\\pip-install-9w_lvhnf\\matplotlib\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\sound\AppData\Local\Temp\pip-pip-egg-info-ihoi_gsj'
+ cwd: C:\Users\sound\AppData\Local\Temp\pip-install-9w_lvhnf\matplotlib\
+ Complete output (249 lines):
+ WARNING: The wheel package is not available.
+ ERROR: Command errored out with exit status 1:
+ command: 'c:\users\sound\appdata\local\programs\python\python39\python.exe' 'c:\users\sound\appdata\local\programs\python\python39\lib\site-packages\pip\_vendor\pep517\_in_process.py' prepare_metadata_for_build_wheel 'C:\Users\sound\AppData\Local\Temp\tmpsqdka5ne'
+
+The change in output when I updated pip kind of threw me off. I'm not sure what to do or how to fix this installation error. Any insight would be appreciated.
+Using pip3.8 install -U matplotlib gives me back
+ERROR: Could not install packages due to an EnvironmentError: [WinError 5] Access is denied: 'C:\\Users\\12132\\AppData\\Local\\Programs\\Python\\Python38-32\\Lib\\site-packages\\~atplotlib\\ft2font.cp38-win32.pyd'
+Consider using the `--user` option or check the permissions.
+
","Your error indicates a typo, there is no module pylot in matplotlib.
+The import you have:
+import matplotlib.pylot as plt
+should be:
+import matplotlib.pyplot as plt
",python
+"TypeError: a bytes-like object is required, not 'str' when reading and iterating over a file in Python3.xI'm trying to read and iterate over a text file in Python but I keep running into this error. I'm pretty new to Python and I'm not sure where the error in my code is:
+import socket
+
+def getBanner(ip, port):
+ try:
+ socket.setdefaulttimeout(2)
+ s = socket.socket()
+ s.connect((ip, port))
+ banner = s.recv(1024)
+ return banner
+ except:
+ return
+
+def vulnCheck(banner):
+ f = open("vuln_banners.txt", 'r')
+ for line in f.readlines():
+ if line.strip('\n') in banner:
+ print("[+] Server is vulnerable: "+banner.strip('\n'))
+ else:
+ print("[-] FTP Server is not vulnerable.")
+ return
+
+
+def main():
+ portList = [21, 22, 25, 80, 110, 443]
+ for x in range(1, 255):
+ ip = '192.168.1.' + str(x)
+ for port in portList:
+ banner = getBanner(ip, port)
+ if banner:
+ print("[+] " + str(ip) + ": " + str(banner))
+ vulnCheck(banner)
+
+if __name__ == '__main__':
+ main()
+
+I would like my program to iterate over a list of strings in a text file to compare against a banner, however, every time I run the code I get this error:
+ File "/home/testing/Documents/HelloWorld.py", line 17, in vulnCheck
+ if line.strip('\n') in banner:
+TypeError: a bytes-like object is required, not 'str'
+
+I've read few of the answers about this error and they talk about NOT opening the file as binary, but I don't think I'm doing that in my code. I'm simply trying to read each line in a text file and iterate over it. Can anyone point out the error in my code? Any help would be appreciated.
","Your socket returns bytes while the file contains strings. You need to map one to the other - probably decode bytes banners to strings.
+As an aside, repeatedly reading the same static text file is a significant bottleneck here. Unless the file is so huge that you should be using a database instead anyway, read it once, at the start.
+The following also has assorted bug fixes and stylistic updates; most importantly, don't return in the middle of the vulnerability check if the first vulnerability wasn't matched. Notice also how functions generally just return a result, and leave it to the caller to decide e.g. whether to print something based on that.
+import socket
+
+def get_banner(ip, port):
+    try:
+        socket.setdefaulttimeout(2)
+        s = socket.socket()
+        s.connect((ip, port))
+        banner = s.recv(1024)
+        return banner.decode('utf-8')
+    except (OSError, UnicodeDecodeError):
+        # connection failed, timed out, or the banner was not valid UTF-8
+        return None
+
+def vuln_check(banner, vulnlist):
+    for line in vulnlist:
+        if line in banner:
+            return True
+    return False
+
+def read_vulns(filename):
+    with open(filename, 'r') as f:
+        return f.read().splitlines()
+
+def main():
+    vulns = read_vulns("vuln_banners.txt")
+
+    port_list = [21, 22, 25, 80, 110, 443]
+    for x in range(1, 255):
+        ip = '192.168.1.' + str(x)
+        for port in port_list:
+            banner = get_banner(ip, port)
+            if banner:
+                print("[+]", ip + ":", banner)
+                if vuln_check(banner, vulns):
+                    print("[+] Server is vulnerable:", banner.strip('\n'))
+                else:
+                    print("[-] FTP Server is not vulnerable.")
+
+if __name__ == '__main__':
+    main()
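The type mismatch itself can be reproduced without any network access. In this small sketch the banner bytes and the pattern are made-up values standing in for s.recv(1024) and a line from the text file.

```python
banner = b'220 ExampleFTPd 1.0 ready\r\n'   # made-up bytes, as a socket would return
pattern = 'ExampleFTPd 1.0'                  # made-up line read from a text file

try:
    pattern in banner                        # str in bytes raises TypeError
except TypeError as exc:
    print('TypeError:', exc)

# Either decode the bytes to str ...
text = banner.decode('utf-8', errors='replace')
print(pattern in text)                       # True

# ... or encode the str to bytes; both sides must have the same type.
print(pattern.encode('utf-8') in banner)     # True
```

Decoding once, right after recv, is usually the simpler choice because everything downstream then works with str.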
+
",python
+How to find total hits on DynamoDB tableI have a Lambda function that is hitting DynamoDB and getting values from there. I wanted to check if there is any way where I can see how many times lambda hits the DynamoDB table. I tried to see in CloudWatch but unable to find it.
,"There is no simple way to do it.
+If the Lambda is the only consumer of this table, you can use the table's read/write capacity metrics.
+They are available under DynamoDB -> your table -> Metrics, or in CloudWatch (CW) Metrics.
+You can then estimate the number of actions from the consumed capacity, as described in this topic:
+How are consumed read capacity units calculated in DynamoDB query
+Currently, AWS doesn't show the exact metric that you asked for.
",python
+"Output of a Nested for Loop - PythonI am trying to get the output of a nested for loop into an array or matrix. For instance, in the code example below, I want to have a (3 by 2)-matrix of the form:
+[[5 6],
+ [6 7],
+ [7 8]]
+
+But my code is giving out of bound error.
+import numpy as np
+
+num = [1,2,3]
+sep = [4, 5]
+M = np.zeros((3,2))
+for i in num:
+ for j in sep:
+ M[i, j] = i + j
+M
+
+However, I realized that changing the initialization to np.zeros((4,6)) seems to work but with some irrelevant cells. Can someone explain how this works or possibly how I can achieve this (3 by 2)-matrix.
","You are using the values in your num and sep lists as indexes. You need to use their positions instead, which enumerate gives you:
+import numpy as np
+
+num = [1,2,3]
+sep = [4, 5]
+M = np.zeros((3,2))
+for i_i,i in enumerate(num):
+    for i_j,j in enumerate(sep):
+        M[i_i, i_j] = i + j
+
+print(M)
+
+Output as required.
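For comparison, here is a sketch of the same table built with a nested list comprehension in plain Python, where the values are never used as indexes at all.

```python
num = [1, 2, 3]
sep = [4, 5]

# Row r, column c holds num[r] + sep[c]; positions come from the
# iteration order, not from the values themselves.
M = [[i + j for j in sep] for i in num]
print(M)  # [[5, 6], [6, 7], [7, 8]]
```

If you need a NumPy array, np.array(M) converts the result, and broadcasting with np.array(num)[:, None] + np.array(sep) should give the same 3 by 2 matrix without any explicit loop.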
",python
+"Alpine.js +flatpickr Datetimepicker is not workingi really stucked in my project. I have a site with alpine.js where i want to render data to an element
+Everything is working perfect until the fact that my flatpickr is not shown up.
+The datepicker is working perfect. It seams, that x-html, x-text nor document.getElementById().innerHTML used in alpine.js is working ....
+<div x-data="app()" x-html="modalData" x-show="isOpenModal" id="test">
+ only a test
+ <input class="picker" />
+</div>
+
+......
+ <script>
+ const fp = flatpickr(".picker", {
+ locale: "at",
+ minDate: "1930-01",
+ maxDate: 'today',
+ enableTime: true,
+ time_24hr: true,
+ minTime: "07:00",
+ maxTime: "20:00",
+ dateFormat: "d.m.Y H:i",
+ disableMobile: "true",
+ static:false,
+ });
+
+ function app(){
+ return {
+ isOpenModal: true,
+ modalData: '<input class=" form-control placeholder-primary-500 picker">',
+ }
+ }
+
+
+in this very simple example 2 input field are shown up, but only the second shows the flatpickr.
+Try:
+
+- If i delete the second the first will be not working.
+- x-text instead of x-html brings only the text <input ..... >
+- on the other hand without alpine.js it is working
+
+<script>
+ const test = document.getElementById('test').innerHTML = '<input class="picker" />';
+ const fp = flatpickr(".picker", {
+ locale: "at",
+ minDate: "1930-01",
+ maxDate: 'today',
+ enableTime: true,
+ time_24hr: true,
+ minTime: "07:00",
+ maxTime: "20:00",
+ dateFormat: "d.m.Y H:i",
+ disableMobile: "true",
+ static:false,
+ });
+ </script>
+
+UPDATE 30.10.20:
+I simplified the code, is still not working but why ?
+<div x-data="test()">
+ <button @click="show = true"> Klick </button>
+ <div x-show="show" x-model="daten" x-html="daten">
+ <input class="bg-green-500 picker" />
+</div>
+
+it is shown up correctly, flatpickr is initialized but the picker is not shown up.
+ function test() {
+ return {
+ daten:'<input class="bg-red-500 picker" />',
+ show: false,
+ }
+}
+
+such a simple code and not working :(
+I hope you understand my confusing special problem.
+Thanks for helping,
+Greets Martin
","The issue here is how flatpickr gets initialized. If you initialize it in the init hook of the Alpine component, it works perfectly, because the code in the init hook runs when the Alpine component initializes.
+So to solve your issue,
+ <div x-data="app()" x-init="initFlatpickr">
+ <input x-ref="picker" />
+ </div>
+
+
+and in the script tag,
+<script>
+ function app() {
+ return {
+ initFlatpickr() {
+ const fp = flatpickr(this.$refs.picker, {
+ locale: "at",
+ minDate: "1930-01",
+ maxDate: "today",
+ enableTime: true,
+ time_24hr: true,
+ minTime: "07:00",
+ maxTime: "20:00",
+ dateFormat: "d.m.Y H:i",
+ disableMobile: "true",
+ static: false,
+ });
+ }
+ }
+ }
+</script>
+
+
+Now the initFlatpickr function will execute when the Alpine.js component is initialized.
+I have made use of refs, which are a helpful alternative to setting ids and using document.querySelector all over the place.
+Check out the docs for more detail.
",python
+"psycopg2.ProgrammingError: can't adapt type 'set'There is this discord bot. I wan't to create for my server. But i used psycopg2 because i wan't it to create a list of projects ideas. And i got this error psycopg2.ProgrammingError: can't adapt type 'set' there is my code:
+import psycopg2
+import os
+from discord.ext import commands
+import discord
+
+conn = psycopg2.connect(
+ host="ec2-3-216-92-193.compute-1.amazonaws.com",
+ database="d1vpende403347",
+ user="hnzgmwsoiogmmt",
+ password="86bfca0c982e04ae0ca6e6f4d1f4fb03ca5f4f4cb9a911672fa993a300e7ea0e",
+ port=5432)
+bot=discord.Client()
+bot=commands.Bot(command_prefix="!")
+cursor=conn.cursor()
+
+@bot.command()
+async def rules(ctx):
+ await ctx.send("1. To enter our server, you have of course to learn python and some libraries, for example: 'discord.py', 'django'.\n2. No insults or bad words.\n3.Enjoy it.")
+@bot.command()
+async def new(ctx, author:discord.Member, project):
+ name=author.display_name
+ await ctx.send(f"There is your entry: The author: {name} , project: {project}")
+ cursor.execute(f"INSERT INTO ideas(author,project) VALUES(%s,%s)",({name},{project}))
+ await ctx.send("The entry was succesfully added""")
+
+print("And There it is, you're connected")
+
+TOKEN=os.environ.get("DISCORD_BOT_SECRET")
+bot.run(TOKEN)
+
+cursor.close()
+conn.close()
+
+thx for the response
","import psycopg2
+import os
+from discord.ext import commands
+import discord
+
+conn = psycopg2.connect(
+ host="ec2-3-216-92-193.compute-1.amazonaws.com",
+ database="d1vpende403347",
+ user="hnzgmwsoiogmmt",
+ password="86bfca0c982e04ae0ca6e6f4d1f4fb03ca5f4f4cb9a911672fa993a300e7ea0e",
+ port=5432)
+bot=discord.Client()
+bot=commands.Bot(command_prefix="!")
+cursor=conn.cursor()
+
+@bot.command()
+async def rules(ctx):
+ await ctx.send("1. To enter our server, you have of course to learn python and some libraries, for example: 'discord.py', 'django'.\n2. No insults or bad words.\n3.Enjoy it.")
+@bot.command()
+async def new(ctx, author:discord.Member, project):
+ name=author.display_name
+ await ctx.send(f"There is your entry: The author: {name} , project: {project}")
+ cursor.execute(f"INSERT INTO ideas(author,project) VALUES(%s,%s)",(name,project))
+ await ctx.send("The entry was succesfully added""")
+
+print("And There it is, you're connected")
+
+TOKEN=os.environ.get("DISCORD_BOT_SECRET")
+bot.run(TOKEN)
+
+cursor.close()
+conn.close()
+
",python
+"Disable unique constraint on Django's ManyToManyFieldFrom the Django documentation for ManyToManyField:
+
+If you don’t want multiple associations between the same instances, add a UniqueConstraint including the from and to fields. Django’s automatically generated many-to-many tables include such a constraint.
+
+How do I disable this constraint? Is the only way to provide an explicit through table? Or is there a way to tell Django to not add the UniqueConstraint to the generated many-to-many-table?
","Many-to-many relations are stored in an intermediary table, which manages the relationships and has a unique_together constraint. This table is created automatically, but you can use your own table with the through argument, as the docs describe. In your case you need to disable the unique_together constraint by defining your own intermediary table.
+Other options are:
+
+- Extend the default ManyToManyField and override the contribute_to_class method.
+- Edit the default ManyToManyField directly as @Melvyn mentioned.
+
",python
+"Python 3 - reading utf-8 encoded csv into pandasI am trying to load my utf-8 encoded csv file with data from Twitter (in polish language) into pandas dataframe in Python 3.
+This is a piece of my csv:
+2020-03-28 20:26:57,"b'Oj b\xc4\x99dzie impreza, oj b\xc4\x99dzie. #WyboryPrezydenckie2020 #Wybory2020 #Wybory\xc5\x9amierci'"
+2020-03-28 20:26:41,"b'Skoro Prezydent ju\xc5\xbc mi\xc4\x99dzy wierszami przemyca, \xc5\xbce wybory mog\xc4\x85 by\xc4\x87 prze\xc5\x82o\xc5\xbcone, to nale\xc5\xbcy czyta\xc4\x87, \xc5\xbce wybory b\xc4\x99d\xc4\x85 prze\xc5\x82o\xc5\xbcone, a i pewnie zostanie to poprzedzone kwiecistym or\xc4\x99dziem Prezydenta w pelerynie zbawcy narodu. #koronowiruswpolsce #WyboryPrezydenckie2020'"
+2020-03-28 20:24:50,"b'Idea i miara. Pomoc wyborc\xc3\xb3w i narodu g\xc5\x82osuj\xc4\x85cego dla medycyny przez #podatek_dla_demokracji, 360 mln z\xc5\x82 na subwencje dla partii i na #WyboryPrezydenckie2020 #Wybory2020 #wybory. STOP-dla-Subwencji dla partii i na wybory z mixu podatkowego.\n@tvp_info\n@Cyfrowy_Polsat\n@tvn24\n#POPiS'"
+
+I was trying to load it this way:
+df = pd.read_csv('WyboryPrezydenckie2020.csv', names=["date", "tweet"], encoding='utf-8')
+
+but result looked like that:
+
+ date tweet
+0 2020-03-28 20:26:57 b'Oj b\xc4\x99dzie impreza, oj b\xc4\x99dzie. ...
+1 2020-03-28 20:26:41 b'Skoro Prezydent ju\xc5\xbc mi\xc4\x99dzy wie...
+2 2020-03-28 20:24:50 b'Idea i miara. Pomoc wyborc\xc3\xb3w i narodu...
+3 2020-03-28 20:22:34 b'RT @wkrawcz1: Kandydat @szymon_holownia m\x...
+4 2020-03-28 20:22:03 b'RT @wkrawcz1: Kandydat @szymon_holownia m\x...
+
+and it seems that my tweets weren't decoded at all. For example the first tweet should look like that:
+Oj będzie impreza, oj będzie. #WyboryPrezydenckie2020 #Wybory2020 #WyboryŚmierci
+
+How can I solve this issue?
","Your CSV stores the string representation of bytes objects (for some reason). To read it properly you need to:
+
+- evaluate the strings to bytes
+- decode the UTF-8 bytes to strings:
+
+
+from ast import literal_eval
+df = pd.read_csv('WyboryPrezydenckie2020.csv', names=["date", "tweet"], converters={"tweet":lambda x:literal_eval(x).decode("utf8")})
+print(df)
+ date tweet
+0 2020-03-28 20:26:57 Oj będzie impreza, oj będzie. #WyboryPrezydenc...
+1 2020-03-28 20:26:41 Skoro Prezydent już między wierszami przemyca,...
+2 2020-03-28 20:24:50 Idea i miara. Pomoc wyborców i narodu głosując...
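The converter can be tried on a single cell without pandas. In this sketch the cell is built from a shortened version of the first tweet, the way the CSV appears to store it.

```python
from ast import literal_eval

# Build a cell the way the CSV stores it: the repr of a bytes object, as text.
cell = repr('Oj będzie impreza'.encode('utf-8'))
print(cell)       # b'Oj b\xc4\x99dzie impreza'

as_bytes = literal_eval(cell)        # str -> bytes, without running arbitrary code
tweet = as_bytes.decode('utf-8')     # bytes -> str
print(tweet)      # Oj będzie impreza
```

literal_eval only accepts Python literals, so it is a safer choice than eval for data that comes from a file.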
+
",python
+"SyntaxError File ""import.py"", line 14def division_title(initial):
+ n = initial.split()
+ if len(n) == 3 :
+ return n
+ else return [n[0],None,n[1]]
+
+SyntaxError: invalid syntax
+error message
","The else needs a colon (else:). You can write it as follows.
+def division_title(initial):
+    n = initial.split()
+    if len(n) == 3:
+        return n
+    else:
+        return [n[0], None, n[1]]
+
+
",python
+"Format string with special charsBelow code throws KeyError. Any ideas please? I tried doubling the braces but still no luck.
+v = "My Name='{x[1].name}'"
+p = "x[1].name"
+pv = 'test'
+v = v.format(p=pv)
+print(v)
+
+I also do not want to create another variable and wanted to work formatting on v variable.
+expected output
+My Name='test'
+
","If you really must use {x[1].name} as a format marker, you can create a suitable object to go in the place of x.
+v = "My Name='{x[1].name}'"
+
+class Foo:
+    name = 'test'
+
+print(v.format(x=[Foo,Foo]))
+
+Output:
+My Name='test'
+
+Here x is a list, x[1] is the class Foo, and x[1].name is the string 'test', as required.
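The KeyError in the question comes from format() looking for an argument named x, while only p was passed. Two variations can be sketched: SimpleNamespace saves writing a class, and a plain replace treats the marker as literal text if you do not need format() at all.

```python
from types import SimpleNamespace

v = "My Name='{x[1].name}'"

# Option 1: let format() resolve x[1].name against a throwaway object.
x = [None, SimpleNamespace(name='test')]
print(v.format(x=x))                   # My Name='test'

# Option 2: treat the marker as literal text and skip format() entirely.
p = 'x[1].name'
pv = 'test'
print(v.replace('{' + p + '}', pv))    # My Name='test'
```

Option 2 matches the variables from the question most closely, since p already holds the text between the braces.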
",python
+"Change date in Dataframe when midnight reachedI've got a Dataframe with a column 'Date_and_time' which is in datetime format. Unfortunately, when midnight is reached (line 15235: 2020-08-02 00:00:00.000000), the date doesn't change accordingly. So, 2020-08-02 00:00:00.000000 should go to 2020-08-03 00:00:00.00000 when midnight (00:00:00.000000) is reached. On line 15235 midnight is reached, but the date doesn't change until line 16000. How can I change this so the date is correct when midnight is reached? Thank you.
+#Load file
+df = pd.read_csv(file, sep=";", names=["Date", "Time", "ID1","ID2","ID3","MP","ET"], skiprows = 1 ,float_precision='round_trip')
+df1 = df['Time'].str.split(expand=True)
+
+#Use columns 'Date' and 'Time' to create column 'Date_and_time' in datetime format
+df['Date_and_time'] = (pd.to_datetime(df['Date']) +
+ pd.to_timedelta(df1[0]) +
+ pd.to_timedelta(df1[1].str.replace('ms','').astype(int), unit='us'))
+
+Out[45]:
+ Date ... Date_and_time
+0 2020/08/02 ... 2020-08-02 21:21:46.000000
+1 2020/08/02 ... 2020-08-02 21:21:46.082191
+2 2020/08/02 ... 2020-08-02 21:21:46.164383
+3 2020/08/02 ... 2020-08-02 21:21:46.246575
+4 2020/08/02 ... 2020-08-02 21:21:46.328767
+ ... ... ...
+15235 2020/08/02 ... 2020-08-02 00:00:00.000000
+15236 2020/08/02 ... 2020-08-02 00:00:00.082191
+15237 2020/08/02 ... 2020-08-02 00:00:00.164383
+15238 2020/08/02 ... 2020-08-02 00:00:00.246575
+ ... ... ...
+16000 2020/08/03 ... 2020-08-03 00:00:16.082191
+ ... ... ...
+330404 2020/08/03 ... 2020-08-03 03:00:33.000000
+330405 2020/08/03 ... 2020-08-03 03:00:33.040513
+330406 2020/08/03 ... 2020-08-03 03:00:33.081026
+330407 2020/08/03 ... 2020-08-03 03:00:33.121539
+330408 2020/08/03 ... 2020-08-03 03:00:33.162052
+
+[330409 rows x 8 columns]
+
","This is essentially another question requiring the "diff-cumsum" trick to accumulate the number of negative changes. In this case, however, .diff() does not support datetime difference so it would be more tricky to do.
+Here is a quick and dirty showcase on df["Date_and_time"]. You should do the same for the other columns involved.
+from datetime import timedelta
+import numpy as np
+import pandas as pd
+
+#df = pd.read_clipboard(sep=r"\s{2,}")
+df["Date_and_time"] = pd.to_datetime(df["Date_and_time"])
+# get timestamp in nanoseconds
+df["ns"] = df["Date_and_time"].values.astype(np.int64)
+# detect reversed time change and accumulate days
+df["days"] = (df["ns"].diff() < 0).cumsum()
+# add the days found
+df["Date_and_time_new"] = df.apply(lambda row: row["Date_and_time"] + timedelta(days=row["days"]), axis=1)
+
+df
+Out[76]:
+ Date Date_and_time ... days Date_and_time_new
+0 2020/08/02 2020-08-02 21:21:46.000000 ... 0 2020-08-02 21:21:46.000000
+1 2020/08/02 2020-08-02 21:21:46.082191 ... 0 2020-08-02 21:21:46.082191
+2 2020/08/02 2020-08-02 21:21:46.164383 ... 0 2020-08-02 21:21:46.164383
+3 2020/08/02 2020-08-02 21:21:46.246575 ... 0 2020-08-02 21:21:46.246575
+4 2020/08/02 2020-08-02 21:21:46.328767 ... 0 2020-08-02 21:21:46.328767
+5 2020/08/02 2020-08-02 00:00:00.000000 ... 1 2020-08-03 00:00:00.000000
+6 2020/08/02 2020-08-02 00:00:00.082191 ... 1 2020-08-03 00:00:00.082191
+7 2020/08/02 2020-08-02 00:00:00.164383 ... 1 2020-08-03 00:00:00.164383
+8 2020/08/02 2020-08-02 00:00:00.246575 ... 1 2020-08-03 00:00:00.246575
+9 2020/08/03 2020-08-03 00:00:16.082191 ... 1 2020-08-04 00:00:16.082191
+10 2020/08/03 2020-08-03 03:00:33.000000 ... 1 2020-08-04 03:00:33.000000
+11 2020/08/03 2020-08-03 03:00:33.040513 ... 1 2020-08-04 03:00:33.040513
+12 2020/08/03 2020-08-03 03:00:33.081026 ... 1 2020-08-04 03:00:33.081026
+13 2020/08/03 2020-08-03 03:00:33.121539 ... 1 2020-08-04 03:00:33.121539
+14 2020/08/03 2020-08-03 03:00:33.162052 ... 1 2020-08-04 03:00:33.162052
+[15 rows x 5 columns]
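+
+The same idea can be sketched without the row-wise apply. This is a minimal sketch, not part of the original answer: add_rollover_days is a hypothetical helper, and it assumes the input is a datetime64 Series in which every backwards jump means exactly one day was skipped. It also relies on Series.diff() accepting datetime data in recent pandas versions, where it returns timedeltas.

```python
import pandas as pd


def add_rollover_days(times):
    # A negative difference means the clock jumped backwards (a new day started).
    rollovers = (times.diff() < pd.Timedelta(0)).cumsum()
    # Shift every row by the number of rollovers seen so far.
    return times + pd.to_timedelta(rollovers, unit='D')


s = pd.to_datetime(pd.Series([
    '2020-08-02 21:21:46',
    '2020-08-02 21:21:47',
    '2020-08-02 00:00:00',
    '2020-08-02 00:00:01',
]))
fixed = add_rollover_days(s)
print(fixed)
```

+The last two rows of this made-up sample move to 2020-08-03, so the corrected column is increasing again.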
+
",python
+"Given list of strings return just strings not contained in any otherI'm trying to find a computationally friendly way for doing the following:
+given list_of_strings = ['many', 'man', 'cat', 'caterpillar', 'pillow', 'pi', 'pill']
+return sublist = ['many', 'caterpillar', 'pillow'], i.e. the list of strings not contained in any other string.
+The simplest solution would be to iterate over the elements and check whether each element is contained in the other, with O(n^2) complexity (even if there are some little optimizations I thought about, like sorting the strings by length, but this adds the sorting complexity), but I think it's too expensive.
+I also thought about building a trie and then using each element of the list as a haystack, keeping only the strings that are found in exactly one haystack (that of the string itself).
+I think I'm missing more than I know, so I'm looking for suggestions
","I've implemented the following (but I'm eager to see other approaches before accepting any answer):
+from itertools import chain
+
+import ahocorasick
+
+def make_automation(list_of_tokens):
+    A = ahocorasick.Automaton()
+    for idx, key in enumerate(list_of_tokens):
+        A.add_word(key, (idx, key))
+    A.make_automaton()
+    return A
+
+def search_list_aho_corasick(txt_to_search, ahocorasick_automaton):
+    match = []
+    for end_index, (insert_order, original_value) in ahocorasick_automaton.iter(txt_to_search):
+        start_index = end_index - len(original_value) + 1
+        if original_value == txt_to_search:
+            # a string always matches itself; skip that hit
+            continue
+        else:
+            match.append(original_value)
+        assert txt_to_search[start_index:start_index + len(original_value)] == original_value
+    return match
+
+def filter_matches(list_of_matches):
+    if len(list_of_matches) == 0:
+        return list_of_matches
+    else:
+        aut = make_automation(list_of_matches)
+        # for each string, collect the other strings that occur inside it
+        contained_result = list(map(lambda x: search_list_aho_corasick(x, aut), list_of_matches))
+        # flatten the per-string match lists (flatten_list was not defined in the snippet)
+        return list(set(list_of_matches) - set(chain.from_iterable(contained_result)))
+
+So in our case:
+result = filter_matches(list_of_strings)
+
+It uses the ahocorasick package (installed as pyahocorasick), which implements the Aho-Corasick algorithm for multi-pattern string search.
+Compared to @IoaTzimas's answer it drastically reduces computation time: I tested both on a list of 40k strings and the run time went from roughly 330 seconds to less than a second.
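+For small inputs, or to cross-check the automaton-based result, a plain-Python baseline may be handy. This is only a sketch and not the method timed above: not_contained_in_any_other is a hypothetical helper that falls back to the quadratic substring check, sorted by length so that each string is only compared against strings that were already kept.

```python
def not_contained_in_any_other(strings):
    # Longest first: a string can only be contained in one that is at least as long.
    candidates = sorted(set(strings), key=len, reverse=True)
    kept = []
    for s in candidates:
        # Containment is transitive, so checking against the kept strings is enough.
        if not any(s in longer for longer in kept):
            kept.append(s)
    return kept


list_of_strings = ['many', 'man', 'cat', 'caterpillar', 'pillow', 'pi', 'pill']
print(sorted(not_contained_in_any_other(list_of_strings)))
# → ['caterpillar', 'many', 'pillow']
```

+The order of the returned list is not meaningful, which is why the example sorts it before printing.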
",python
+"Numpy doesn't throw FloatingPointError for dot productI am taking the dot-product of two numpy arrays (both float32). I am deliberately setting numpy to raise a FloatingPointError in case of overflow. However, dot does not behave as expected: instead of raising a FloatingPointError as it does when using ordinary multiplication, dot returns inf.
+Is this the intended behavior? Is there a way to force dot to raise an exception as well?
+Minimal working example:
+import numpy as np
+
+np.seterr(over="raise")
+
+x = np.array([2e+38], dtype=np.float32)
+y = np.array([10], dtype=np.float32)
+
+x * y
+>>> FloatingPointError: overflow encountered in multiply
+
+np.dot(x,y)
+>>> inf
+
","Accepting @hpaulj's comment as the answer – matmul does throw a FloatingPointError as required, while dot doesn't (no idea why). For my purpose matmul gives the same result as dot.
",python
+"Creating masks for duplicate elements in Tensorflow/KerasI am trying to write a custom loss function for a person-reidentification task which is trained in a multi-task learning setting along with object detection. The filtered label values are of the shape (batch_size, num_boxes). I would like to create a mask such that only the values which repeat in dim 1 are considered for further calculations. How do I do this in TF/Keras-backend?
+Short Example:
+Input labels = [[0,0,0,0,12,12,3,3,4], [0,0,10,10,10,12,3,3,4]]
+Required output: [[0,0,0,0,1,1,1,1,0],[0,0,1,1,1,0,1,1,0]]
+
+(Basically I want to filter out only duplicates and discard unique identities for the loss function).
+I guess a combination of tf.unique and tf.scatter could be used but I do not know how.
","This code works:
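+import tensorflow as tf
+
+x = tf.constant([[0,0,0,0,12,12,3,3,4], [0,0,10,10,10,12,3,3,4]])
+def mark_duplicates_1D(x):
+    # count how often each distinct label occurs in the row
+    y, idx, count = tf.unique_with_counts(x)
+    comp = tf.math.greater(count, 1)
+    comp = tf.cast(comp, tf.int32)
+    # map the per-label flag back onto every position of the row
+    res = tf.gather(comp, idx)
+    # label 0 is never marked, even though it repeats
+    mult = tf.math.not_equal(x, 0)
+    mult = tf.cast(mult, tf.int32)
+    res *= mult
+    return res
+res = tf.map_fn(fn=mark_duplicates_1D, elems=x)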
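+
+To check the expected output without TensorFlow, the same rule can be sketched in NumPy. This is a cross-check under stated assumptions, not part of the answer: mark_duplicates_np is a hypothetical name, and label 0 is treated as background that is never marked, as in the function above.

```python
import numpy as np


def mark_duplicates_np(labels):
    labels = np.asarray(labels)
    mask = np.zeros_like(labels)
    for r, row in enumerate(labels):
        values, counts = np.unique(row, return_counts=True)
        # Keep only labels that repeat within the row and are not the background label 0.
        repeated = values[(counts > 1) & (values != 0)]
        mask[r] = np.isin(row, repeated)
    return mask


labels = [[0, 0, 0, 0, 12, 12, 3, 3, 4], [0, 0, 10, 10, 10, 12, 3, 3, 4]]
print(mark_duplicates_np(labels))
```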
+
",python
+"How do I split up thresholds into squares in OpenCV2?I have a picture of a lovely Rubiks cube:
+![]()
+I want to split it into squares and identify the colour of each square. I can run a Gaussian blur on it, followed by 'Canny' and then 'Dilate', to get the following:
+![]()
+This visibly looks good, but I'm unable to turn it into squares. Any sort of 'findContours' I try brings up only one or two squares. Nowhere near the nine I'm aiming for. Do people have any ideas on what I can do beyond this?
+Current best solution:
+![]()
+The code is below and requires numpy + opencv2. It expects a file called './sides/rubiks-side-F.png' and outputs several files to a 'steps' folder.
+import numpy as np
+import cv2 as cv
+
+def save_image(name, file):
+    return cv.imwrite('./steps/' + name + '.png', file)
+
+
+def angle_cos(p0, p1, p2):
+    d1, d2 = (p0-p1).astype('float'), (p2-p1).astype('float')
+    return abs(np.dot(d1, d2) / np.sqrt(np.dot(d1, d1)*np.dot(d2, d2)))
+
+def find_squares(img):
+    img = cv.GaussianBlur(img, (5, 5), 0)
+    squares = []
+    for gray in cv.split(img):
+        bin = cv.Canny(gray, 500, 700, apertureSize=5)
+        save_image('post_canny', bin)
+        bin = cv.dilate(bin, None)
+        save_image('post_dilation', bin)
+        for thrs in range(0, 255, 26):
+            if thrs != 0:
+                _retval, bin = cv.threshold(gray, thrs, 255, cv.THRESH_BINARY)
+                save_image('threshold', bin)
+            contours, _hierarchy = cv.findContours(
+                bin, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)
+            for cnt in contours:
+                cnt_len = cv.arcLength(cnt, True)
+                cnt = cv.approxPolyDP(cnt, 0.02*cnt_len, True)
+                if len(cnt) == 4 and cv.contourArea(cnt) > 1000 and cv.isContourConvex(cnt):
+                    cnt = cnt.reshape(-1, 2)
+                    max_cos = np.max(
+                        [angle_cos(cnt[i], cnt[(i+1) % 4], cnt[(i+2) % 4]) for i in range(4)])
+                    if max_cos < 0.2:
+                        squares.append(cnt)
+    return squares
+
+img = cv.imread("./sides/rubiks-side-F.png")
+squares = find_squares(img)
+cv.drawContours(img, squares, -1, (0, 255, 0), 3)
+save_image('squares', img)
+
+You can find other sides here
","I know that you might not accept this answer because it is written in C++. That's ok; I just want to show you a possible approach for detecting the squares. I'll try to include as much detail as possible if you wish to port this code to Python.
+The goal is to detect all 9 squares, as accurately as possible. These are the steps:
+
+- Get an edge mask where the outline of the complete cube is
+clear and visible.
+- Filter these edges to get a binary cube (segmentation) mask.
+- Use the cube mask to get the cube’s bounding box/rectangle.
+- Use the bounding rectangle to get the dimensions and location of
+each square (all the squares have constant dimensions).
+
+First, I'll try to get an edges mask applying the steps you described. I just want to make sure I get to a similar starting point like where you currently are.
+The pipeline is this: read the image > grayscale conversion > Gaussian Blur > Canny Edge detector:
+ //read the input image:
+ std::string imageName = "C://opencvImages//cube.png";
+ cv::Mat testImage = cv::imread( imageName );
+
+ //Convert BGR to Gray:
+ cv::Mat grayImage;
+ cv::cvtColor( testImage, grayImage, cv::COLOR_BGR2GRAY );
+
+ //Apply Gaussian blur with a X-Y Sigma of 50:
+ cv::GaussianBlur( grayImage, grayImage, cv::Size(3,3), 50, 50 );
+
+ //Prepare edges matrix:
+ cv::Mat testEdges;
+
+ //Setup lower and upper thresholds for edge detection:
+ float lowerThreshold = 20;
+ float upperThreshold = 3 * lowerThreshold;
+
+ //Get Edges via Canny:
+ cv::Canny( grayImage, testEdges, lowerThreshold, upperThreshold );
+
+Alright, this is the starting point. This is the edges mask I get:
+
+Close to your results. Now, I'll apply a dilation. Here, the number of iterations of the operation is important, because I want nice, thick edges. Closing open contours is also desirable, so I want a moderately aggressive dilation. I set the number of iterations to 5, using a rectangular structuring element.
+ //Prepare a rectangular, 3x3 structuring element:
+ cv::Mat SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(3, 3) );
+
+ //OP iterations:
+ int dilateIterations = 5;
+
+ //Prepare the dilation matrix:
+ cv::Mat binDilation;
+
+ //Perform the morph operation:
+ cv::morphologyEx( testEdges, binDilation, cv::MORPH_DILATE, SE, cv::Point(-1,-1), dilateIterations );
+
+I get this:
+
+This is the output so far with nice and very defined edges. The most important thing is to clearly define the cube, because I'll rely on its outline to compute the bounding rectangle later.
+What follows is my attempt to clean the cube's edges from everything else as accurately as possible. There's a lot of garbage and pixels that do not belong to the cube, as you can see. I'm especially interested in flood-filling the background with a color (white) different from the cube (black) in order to get a nice segmentation.
+Flood-filling has a disadvantage, though: it can also fill the interior of a contour if the contour is not closed. I try to clean garbage and close contours in one go with a "border mask", which is just a set of white lines along the sides of the dilation mask.
+I implement this mask as four SUPER THICK lines that border the dilation mask. To draw the lines I need starting and ending points, which correspond to the image corners. These are defined in a vector:
+ std::vector< std::vector<cv::Point> > imageCorners;
+ imageCorners.push_back( { cv::Point(0,0), cv::Point(binDilation.cols,0) } );
+ imageCorners.push_back( { cv::Point(binDilation.cols,0), cv::Point(binDilation.cols, binDilation.rows) } );
+ imageCorners.push_back( { cv::Point(binDilation.cols, binDilation.rows), cv::Point(0,binDilation.rows) } );
+ imageCorners.push_back( { cv::Point(0,binDilation.rows), cv::Point(0, 0) } );
+
+Four starting/ending coordinates in a vector of four entries. I apply the "border mask" looping through these coordinates and drawing the thick lines:
+ //Define the SUPER THICKNESS:
+ int lineThickness = 200;
+
+ //Loop through my line coordinates and draw four lines at the borders:
+ for ( int c = 0 ; c < 4 ; c++ ){
+     //Get current vector of points:
+     std::vector<cv::Point> currentVect = imageCorners[c];
+     //Get the starting/ending points:
+     cv::Point startPoint = currentVect[0];
+     cv::Point endPoint = currentVect[1];
+     //Draw the line:
+     cv::line( binDilation, startPoint, endPoint, cv::Scalar(255,255,255), lineThickness );
+ }
+
+Cool. This gets me this output:
+
+Now, let's apply the floodFill algorithm. This operation will fill a closed area of same colored pixels with a "substitute" color. It needs a seed point and the substitute color (white in this case). Let's Flood-fill at the four corners inside of the white mask we just created.
+ //Set the offset of the image corners. Ensure the area to be filled is black:
+ int fillOffsetX = 200;
+ int fillOffsetY = 200;
+ cv::Scalar fillTolerance = 0; //No tolerance
+ int fillColor = 255; //Fill color is white
+
+ //Get the dimensions of the image:
+ int targetCols = binDilation.cols;
+ int targetRows = binDilation.rows;
+
+ //Flood-fill at the four corners of the image:
+ cv::floodFill( binDilation, cv::Point( fillOffsetX, fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
+ cv::floodFill( binDilation, cv::Point( fillOffsetX, targetRows - fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
+ cv::floodFill( binDilation, cv::Point( targetCols - fillOffsetX, fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
+ cv::floodFill( binDilation, cv::Point( targetCols - fillOffsetX, targetRows - fillOffsetY ), fillColor, (cv::Rect*)0, fillTolerance, fillTolerance);
+
+This can also be implemented as a loop, just like the "border mask". After this operation I get this mask:
+
+Getting close, right? Now, depending on your image, some garbage could survive all these "cleaning" operations. I'd suggest applying an area filter. The area filter will remove every blob of pixels that is under a threshold area. This is useful, because the cube's blobs are the biggest blobs on the mask and those surely will survive the area filter.
+Anyway, I'm only interested in the cube's outline; I don't need the lines inside the cube. I'm going to dilate the hell out of the (inverted) blob and then erode it back to its original dimensions to get rid of the lines inside the cube:
+ //Get the inverted image:
+ cv::Mat cubeMask = 255 - binDilation;
+
+ //Set some really high iterations here:
+ int closeIterations = 50;
+
+ //Dilate
+ cv::morphologyEx( cubeMask, cubeMask, cv::MORPH_DILATE, SE, cv::Point(-1,-1), closeIterations );
+ //Erode:
+ cv::morphologyEx( cubeMask, cubeMask, cv::MORPH_ERODE, SE, cv::Point(-1,-1), closeIterations );
+
+This is a closing operation, and a pretty brutal one. Below is the result of applying it. Remember that I previously inverted the image:
+
+Isn't that nice or what? Check out the cube mask, here overlaid on the original RGB image:
+
+Excellent, now let's get the bounding box of this blob. The approach is as follows:
+Get blob contour > Convert contour to bounding box
+
+This is fairly straightforward to implement, and the Python equivalent should be very similar to this. First, get the contours via findContours. As you see, there should be only one contour: the cube outline. Next, convert the contour to a bounding rectangle using boundingRect. In C++ this is the code:
+ //Lets get the blob contour:
+ std::vector< std::vector<cv::Point> > contours;
+ std::vector<cv::Vec4i> hierarchy;
+
+ cv::findContours( cubeMask, contours, hierarchy, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE, cv::Point(0, 0) );
+
+ //There should be only one contour, the item number 0:
+ cv::Rect boundigRect = cv::boundingRect( contours[0] );
+
+These are the contours found (just one):
+
+Once you convert this contour to a bounding rectangle, you can get this nice image:
+
+Ah, we are very close to the end here. As all the squares have the same dimensions and your image doesn’t seem to be very perspective-distorted, we can use the bounding rectangle to estimate the square dimensions. All the squares have the same width and height, there are 3 squares per cube width and 3 squares per cube height.
+Divide the bounding rectangle in 9 equal sub-squares (or, as I call them, "grids") and get their dimensions and location starting from the coordinates of the bounding box, like this:
+ //Number of squares or "grids"
+ int verticalGrids = 3;
+ int horizontalGrids = 3;
+
+ //Grid dimensions:
+ float gridWidth = (float)boundigRect.width / 3.0;
+ float gridHeight = (float)boundigRect.height / 3.0;
+
+ //Grid counter:
+ int gridCounter = 1;
+
+ //Loop thru vertical dimension:
+ for ( int j = 0; j < verticalGrids; ++j ) {
+
+ //Grid starting Y:
+ int yo = j * gridHeight;
+
+ //Loop thru horizontal dimension:
+ for ( int i = 0; i < horizontalGrids; ++i ) {
+
+ //Grid starting X:
+ int xo = i * gridWidth;
+
+ //Grid dimensions:
+ cv::Rect gridBox;
+ gridBox.x = boundigRect.x + xo;
+ gridBox.y = boundigRect.y + yo;
+ gridBox.width = gridWidth;
+ gridBox.height = gridHeight;
+
+ //Draw a rectangle using the grid dimensions:
+ cv::rectangle( testImage, gridBox, cv::Scalar(0,0,255), 5 );
+
+ //Int to string:
+ std::string gridCounterString = std::to_string( gridCounter );
+
+ //String position:
+ cv::Point textPosition;
+ textPosition.x = gridBox.x + 0.5 * gridBox.width;
+ textPosition.y = gridBox.y + 0.5 * gridBox.height;
+
+ //Draw string:
+ cv::putText( testImage, gridCounterString, textPosition, cv::FONT_HERSHEY_SIMPLEX,
+ 1, cv::Scalar(255,0,0), 3, cv::LINE_8, false );
+
+ gridCounter++;
+
+ }
+
+ }
+
+Here, for each grid, I'm drawing its rectangle and a nice number at the center of it. The draw rectangle function requires a defined rectangle: Upper left starting coordinates and rectangle width and height, which are defined using the gridBox variable of cv::Rect type.
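+For a Python port, the grid arithmetic above needs no OpenCV at all. The following is a rough sketch under that assumption: split_into_grid is a hypothetical helper, and the tuples it returns mirror the x, y, width, height layout that cv2.boundingRect produces.

```python
def split_into_grid(x, y, width, height, rows=3, cols=3):
    # Cell size as floats, so rounding errors do not accumulate across cells.
    cell_w = width / cols
    cell_h = height / rows
    cells = []
    for j in range(rows):          # top to bottom
        for i in range(cols):      # left to right
            cells.append((int(x + i * cell_w), int(y + j * cell_h),
                          int(cell_w), int(cell_h)))
    return cells


# Made-up bounding box: top-left corner at (10, 20), 300 x 300 pixels.
cells = split_into_grid(10, 20, 300, 300)
for number, cell in enumerate(cells, start=1):
    print(number, cell)
```

+Each tuple can then be used to crop one cell from the image, for example img[y:y+h, x:x+w], before averaging its colour.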
+Here's a cool animation of how the cube gets divided into 9 grids:
+
+Here’s the final image!
+
+Some suggestions:
+
+- Your source image is way too big. Try resizing it to a smaller size, operate
+on it and scale the results back.
+- Implement the area filter. It is very handy for getting rid of small
+blobs of pixels.
+- Depending on your images (I just tested the ones you posted in your
+question) and the perspective distortion introduced by the camera, a
+simple contour to boundingRect conversion might not be enough. In that case,
+another approach would be to get the four points of the cube outline
+via Hough line detection.
+
",python
+"Ansible Runner can't execute playbookI am trying execute an ansible playbook inside a Flask Python project using ansible runner but upon execution, I get the following error: The command was not found or was not executable: ansible-playbook.
+The app runs in a docker container inside directory /app.
+Code:
+    r = ansible_runner.run(private_data_dir='/app/flask/ansible', playbook='project/playbook.yml')
+    app.logger.info("{}: {}".format(r.status, r.rc))
+    # successful: 0
+    for each_host_event in r.events:
+        app.logger.info(each_host_event['event'])
+    app.logger.info("Final status:")
+    app.logger.info(r.stats)
+
+This is the project tree:
+.
+├── README.md
+├── ansible.cfg
+├── docker-compose.yml
+├── flask
+│ ├── Dockerfile
+│ ├── ansible
+│ │ ├── env
+│ │ │ ├── cmdline
+│ │ │ ├── envvars
+│ │ │ ├── extravars
+│ │ │ ├── passwords
+│ │ │ ├── settings
+│ │ │ └── ssh-key
+│ │ ├── inventory
+│ │ │ └── hosts
+│ │ └── project
+│ │ └── playbook.yml
+│ ├── app.ini
+│ ├── main.py
+│ ├── run.py
+│ ├── static
+│ │ ├── app.js
+│ │ ├── bulma.min.css
+│ │ ├── highlight.min.css
+│ │ ├── highlight.min.js
+│ │ └── styles.css
+│ └── templates
+│ ├── 404.html
+│ ├── base.html
+│ ├── create_user.html
+│ └── login.html
+├── nginx
+│ ├── Dockerfile
+│ └── nginx.conf
+
+Flask DockerFile:
+FROM python:3.7.2-stretch
+WORKDIR /app
+ADD . /app
+RUN pip install --upgrade pip && pip install flask uwsgi requests ansible_runner
+CMD ["uwsgi","app.ini"]
+
","The ansible_runner Python package is just an interface to the ansible executable. You need to install Ansible itself within your Docker container. Add RUN apt-get update && apt-get install -y ansible to your Dockerfile
+FROM python:3.7.2-stretch
+
+RUN apt-get update && \
+ apt-get install -y ansible && \
+ rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+ADD . /app
+RUN pip install --upgrade pip && pip install flask uwsgi requests ansible_runner
+CMD ["uwsgi","app.ini"]
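+
+A missing executable can also be caught before the runner is called. This is a small sketch, not something ansible_runner provides: require_executable is a hypothetical helper built on shutil.which from the standard library.

```python
import shutil


def require_executable(name):
    # shutil.which returns None when the command cannot be found on PATH.
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f'{name} was not found on PATH; is Ansible installed in this image?')
    return path


try:
    print(require_executable('ansible-playbook'))
except RuntimeError as err:
    print(err)
```

+Calling it once at application start-up makes the problem show up in the logs before the first playbook run.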
+
",python
+"How to find the row number from a character index in python?I have a genetic dataset where the index of a row is the name of the gene. I am looking to also find the row number of any given gene so I can look at genes individually after they've gone through a machine learning model prediction - to interpret the gene's prediction in shap. How I code for the shap plot currently needs a row number to pull out the specific gene.
+My data looks like this:
+Index Feature1 Feature2 ... FeatureN
+Gene1 1 0.2 10
+Gene2 1 0.1 7
+Gene3 0 0.3 10
+
+For example if I want to pull out and view model prediction of Gene3 I do this:
+import shap
+shap.initjs()
+
+xgbr = xgboost.XGBRegressor()
+
+def shap_plot(j):
+    explainerModel = shap.TreeExplainer(xgbr)
+    shap_values_Model = explainerModel.shap_values(X_train)
+    p = shap.force_plot(explainerModel.expected_value, shap_values_Model[j], X_train.iloc[[j]],feature_names=df.columns)
+    return(p)
+
+shap_plot(3)
+
+Doing shap_plot(3) is a problem for me as I do not actually know if the gene I want is in row 3 in the shuffled training or testing data.
+Is there a way to pull out the row number from a known Gene index? Or potentially re-code my shap plot so it does accept my string indices? I have a biology background so any guidance would be appreciated.
","Try the following. df is your dataframe and result will give you the row number (first row will result 1, etc) for a given gene
+list(df.index).index('Gene3')+1
+
+#result
+
+3
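+
+Note that the expression above counts from 1, whereas iloc and the shap_values array used in shap_plot count from 0. A sketch of a zero-based lookup, assuming the gene names are unique in the index (the small X_train frame here is made-up illustration data):

```python
import pandas as pd

# Shuffled toy training data with gene names as the index.
X_train = pd.DataFrame(
    {'Feature1': [0, 1, 1], 'Feature2': [0.3, 0.2, 0.1]},
    index=['Gene3', 'Gene1', 'Gene2'],
)

# get_loc returns the zero-based position of a label in a unique index.
row = X_train.index.get_loc('Gene3')
print(row)
print(X_train.iloc[[row]])
```

+The resulting position can be passed straight to shap_plot(row).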
+
",python
+"How do I create a ""rpg door"" effect in pygame, and what's wrong with my method?I'm creating a game where going near a door(defined using Rect function) calls a function which loads a new game screen which gives the effect of going through to that door inside the house. I tried to do the same thing with the coming out mechanism where I tried to define an area with a Rect object and made it so that when the player comes near it the main game loop is called and it would give the effect of coming out the door but doing so makes it so that the player is stuck in an infinite loop of the screen fading away. the part of the code is attached below and the full code is here
+this is the door collision detection
+    out = pygame.Rect(360, 580, 80, 10)
+    player_rect = playerImgXL.get_rect(topleft=(playerXxl, playerYxl))
+
+    if player_rect.colliderect(out):
+        game() # coming out of the house
+
+this is the game loop function
+def game():
+    global present_dialogue
+    global current_dialogue
+    global playerX
+    global playerY
+    clock.tick(12)
+    mixer.music.pause()
+    mixer.music.load('pallet_music.mp3')
+    # mixer.music.play(100)
+    playerX_change = 0
+    playerY_change = 0
+    running = True
+    while running:
+        present_dialogue = None
+        current_dialogue = None
+        for event in pygame.event.get():
+            if event.type == pygame.QUIT:
+                pygame.quit()
+                sys.exit()
+        redrawgamewindow()
+        pygame.display.update()
+
+This isn't the entire game loop: the event detection and collision detection sit between the sys.exit statement and the display update statement.
+This is the redrawgamewindow function that I call at the end of the game loop:
+def redrawgamewindow():
+ global walkcount
+ scr.fill((0, 0, 0))
+ scr.blit(pallet, (60, 0))
+ if current_dialogue:
+ scr.blit(*current_dialogue)
+ npc_one_dialogue()
+ if walkcount + 1 >= 29:
+ walkcount = 0
+ if up:
+ scr.blit(WalkFront[walkcount // 7], (playerX, playerY))
+ walkcount += 1
+ elif down:
+ scr.blit(WalkBack[walkcount // 7], (playerX, playerY))
+ walkcount += 1
+ elif left:
+ scr.blit(WalkLeft[walkcount // 7], (playerX, playerY))
+ walkcount += 1
+ elif right:
+ scr.blit(WalkRight[walkcount // 7], (playerX, playerY))
+ walkcount += 1
+ else:
+ player(playerX, playerY)
+ if present_dialogue:
+ scr.blit(*present_dialogue)
+ npc_two_dialogue()
+
","A few changes are needed to prevent the fade loop:
+
+In the inhouse_oak function, reset the starting position when the player enters the room
+
+When the player exits the room, just return to the main game loop
+def inhouse_oak():
+    global playerXxl
+    global playerYxl
+    playerXxl = 365 # reset starting position
+    playerYxl = 480
+    ...........
+    while running:
+        .................
+        if player_rect.colliderect(out):
+            return # return to main loop
+            # game() # coming out of the house
+
+
+In the main loop, when the player exits the room, move the player away from the door
+    door3 = pygame.Rect(462, 348, 25, 5)
+    if player_rect.colliderect(door3):
+        fade(800, 600)
+        inhouse_oak() # Oak's Lab door
+        playerY_change = playerX_change = 0 # stop player movement
+        playerY += 10 # move away from door
+
+
+
+To add a fade when exiting the room, make these changes:
+
+Generalize the fade function.
+def fade(x, y, rgw): # last parameter is screen function to call
+    fade = pygame.Surface((x, y))
+    fade.fill((0, 0, 0))
+    for alpha in range(0, 300):
+        fade.set_alpha(alpha)
+        rgw(True) # fading = True
+        scr.blit(fade, (0, 0))
+        pygame.display.update()
+
+
+Add the fading parameter to the main game's redraw function.
+def redrawgamewindow(fading=False):
+
+
+Add the fading parameter to the room function and check the parameter before updating the screen to prevent screen flash.
+def redrawgamewindow_oak(fading=False):
+    .........
+    if not fading: pygame.display.update() # prevent flash if fading
+
+
+Update the fade call in the main game loop.
+    door3 = pygame.Rect(462, 348, 25, 5)
+    if player_rect.colliderect(door3):
+        fade(800, 600, redrawgamewindow) # fade main game
+        inhouse_oak() # Oak's Lab door
+        playerY_change = playerX_change = 0
+        playerY += 10 #playerX_change
+
+
+In inhouse_oak, when leaving the room, call the fade function.
+    if player_rect.colliderect(out):
+        fade(800, 600, redrawgamewindow_oak) # fade room
+        return # return to main loop
+        #game() # coming out of the house
+
+
+
",python
+"how do I process an excel file with hyperlink/url in pandas?I have an excel file that has one column filled with Hyperlinks, I read it using df = pd.read_excel() then filtered it and saved it to a new excel file with df.to_excel().
+The problem is that I have now lost the clickable hyperlinks, instead, there's just the text(not a hyperlink)
+Can I use pandas for this? or should I be using some other library?
","You can use the import xlsxwriter library to add hyperlinks. Speaking of hyperlinks, the example here shows some examples such as:
+worksheet.write_url('A5', 'http://www.python.org/', tip='Click here')
+
+But, if you don't want to manually write a line of code for each cell, then you can loop through and add the hyperlinks dynamically if you have a list of all the hyperlinks.
+hyperlinks = ['a.com', 'b.com', 'c.com' ... etc.]
+
+for i in range(1, len(hyperlinks) + 1):  # + 1 so the last hyperlink is not skipped
+    worksheet.write_url(f'A{i}', hyperlinks[i-1], tip=df['column string'][i-1])
+
+Your hyperlinks would obviously have to be in the correct order in the list, or you could create a dictionary that makes the text and hyperlink a key-value pair and use .map to bring the hyperlink into your dataframe as a column. Then you could sort the values and send the hyperlink to a list with hyperlinks = df['hyperlink'].to_list(). Then you could run the for-loop.
+But, I think you will have to create a list or dictionary first.
+
+Also, check out this answer for reading in data with hyperlinks using openpyxl:
+Pandas read_excel with Hyperlink
+And, this one for writing data with hyperlinks using pandas:
+add hyperlink to excel sheet created by pandas dataframe to_excel method
",python
+"Close PyQt Dialog without closing main programmeI am trying to use a PyQt dialog created in Qt Designer to take some user inputs (complexity rating, material type and machine type) before using these input for the reminder of my python programme. The options displayed in the dialog box are read out from a dictionary. My problem is that no matter what I have tried, closing the dialog box, through pressing the submit button, stops the remainder of my programme running, whether I keep the dialog box in the main py programme or run it as a function in a separate file. I am quite sure it is related to the sys.exit(app.exec_()) line, I have also tried using .close and .reject for closing the dialog with the same result. Also I'm aware that the programme is not great and I'm butchering the variable passing out of the function, but if you have any advice for how to get the rest of my programme talking with the dialog box I would be super grateful, I have exhausted the rest of Google on this problem, thanks so much!
+import os
+import numpy as np
+
+def get_part_info():
+ material_ops =[]
+ complex_ops = [1,2,3]
+ machine_ops = []
+
+#---Dictionary containing material options and machine options is read out here, this part works fine ----
+
+ mat_choice = 'empty'
+ comp_choice = 'empty'
+ mach_choice = 'empty'
+
+ from PyQt5 import QtCore, QtGui, QtWidgets
+ class Ui_Dialog(object):
+ def setupUi(self, Dialog):
+ Dialog.setObjectName("Dialog")
+ Dialog.resize(227, 217)
+ self.verticalLayout_3 = QtWidgets.QVBoxLayout(Dialog)
+ self.verticalLayout_3.setObjectName("verticalLayout_3")
+ self.verticalLayout = QtWidgets.QVBoxLayout()
+ self.verticalLayout.setObjectName("verticalLayout")
+ self.Material = QtWidgets.QComboBox(Dialog)
+ self.Material.setObjectName("Material")
+ self.verticalLayout.addWidget(self.Material)
+ self.Complexity = QtWidgets.QComboBox(Dialog)
+ self.Complexity.setObjectName("Complexity")
+ self.verticalLayout.addWidget(self.Complexity)
+ self.Machine = QtWidgets.QComboBox(Dialog)
+ self.Machine.setObjectName("Machine")
+ self.verticalLayout.addWidget(self.Machine)
+ self.textEdit = QtWidgets.QTextEdit(Dialog)
+ self.textEdit.setObjectName("textEdit")
+ self.verticalLayout.addWidget(self.textEdit)
+ self.verticalLayout_3.addLayout(self.verticalLayout)
+ self.Submit = QtWidgets.QPushButton(Dialog)
+ self.Submit.setMaximumSize(QtCore.QSize(100, 16777215))
+ self.Submit.setObjectName("Submit")
+ self.verticalLayout_3.addWidget(self.Submit, 0, QtCore.Qt.AlignHCenter|QtCore.Qt.AlignVCenter)
+
+#------Read out from the dictionary is added to the drop down menus here-----
+
+ for i in list(material_ops):
+ self.Material.addItem(i)
+ for i in list(complex_ops):
+ self.Complexity.addItem(str(i))
+ for i in list(machine_ops):
+ self.Machine.addItem(i)
+
+ self.Submit.pressed.connect(self.save)
+ self.retranslateUi(Dialog)
+ self.Submit.pressed.connect(Dialog.reject)
+ QtCore.QMetaObject.connectSlotsByName(Dialog)
+
+ def retranslateUi(self, Dialog):
+ _translate = QtCore.QCoreApplication.translate
+ Dialog.setWindowTitle(_translate("Dialog", "Dialog"))
+ self.Submit.setText(_translate("Dialog", "Submit"))
+
+ def save(self):
+ global mat_choice, comp_choice, mach_choice
+ mat_choice = (self.Material.currentText())
+ comp_choice = (self.Complexity.currentText())
+ mach_choice = (self.Machine.currentText())
+
+ if __name__ == "__main__":
+ import sys
+ app = QtWidgets.QApplication(sys.argv)
+ Dialog = QtWidgets.QDialog()
+ ui = Ui_Dialog()
+ ui.setupUi(Dialog)
+ Dialog.show()
+ sys.exit(app.exec_())
+ return mat_choice, comp_choice, mach_choice, matdict
+
+get_part_info()
+
+print('Rest of programme is working') # programme never gets this far
+
+#---The rest of the programme that uses these user chosen options is here and never runs due to the dialog closing stopping the whole programme ------
+
",I solved this by removing the sys.exit(app.exec_()) line and used just app.exec_() instead which successfully runs the input dialog and then the remainder of the programme using the chosen values. I can't pretend to know why it works now but it does in case anyone encounters a similar issue.
,python
+"Problems converting a python pygame into exeokay so I'm trying to convert my python pygame game into a .exe file so I can send it to friends but every time I try to convert it creates the .exe file but I can't open it doesn't mark any error and it doesn't have any extra files
+I have tried auto-py-to-exe and pyinstaller with almost every possible combination of options.
+It only uses the pygame, sys and random modules, and Python of course.
+Here is a GitHub repository with the file.
+I don't know if you can tell me how to convert it, or just straight up convert it for me.
","Get py2exe. It is easy to use. Assuming your main script is game.py and that script includes all the other stuff, all you need to do is prepare yet another script, let us call it setup.py and type in:
+
+from distutils.core import setup
+import py2exe
+setup(console=['game.py'])
+
+Now execute that file and it compiles everything into an executable:
+
+python setup.py py2exe
+
",python
+"Creating new column from filtering othersI need to assign to a new column the value 1 or 0 depending on what other columns have.
+I have around 30 columns with binary values (1 or 0), but also other variables with continuous numeric values (e.g. 200). I would like to avoid writing a logical condition with many ORs, so I was wondering if there is an easy and fast way to do it.
+For example, creating a list with name of columns and assign 1 to the new column if there is at least a value 1 across all the columns for that corresponding row.
+Example:
+a1 b1 d4 ....
+1 0 1
+0 0 1
+0 0 0
+...
+
+Expected:
+a1 b1 d4 .... New
+1 0 1 1
+0 0 1 1
+0 0 0 0
+...
+
+Many thanks for your help
","Here is a simple solution:
+df = pd.DataFrame({'a1':[1,0,0,1], 'b1':[0,0,0,1], 'd4':[1,1,0,0], 'num':[12,-2,0,3]})
+df['New'] = df[['a1','b1','d4']].any(axis=1).astype('int')
+df
+
+ a1 b1 d4 num New
+0 1 0 1 12 1
+1 0 0 1 -2 1
+2 0 0 0 0 0
+3 1 1 0 3 1
+
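If the binary columns are kept in a list, as the question suggests, the same idea can be sketched as follows. This is only an illustration: the `binary_cols` name and the sample data are assumptions, not something taken from the question.

```python
import pandas as pd

# Sample data (assumed for illustration): three binary columns plus one numeric column.
df = pd.DataFrame({
    'a1': [1, 0, 0, 1],
    'b1': [0, 0, 0, 1],
    'd4': [1, 1, 0, 0],
    'num': [12, -2, 0, 3],
})

# Hypothetical list naming the binary columns to check.
binary_cols = ['a1', 'b1', 'd4']

# eq(1) marks cells equal to 1; any(axis=1) is True when a row has at least one.
df['New'] = df[binary_cols].eq(1).any(axis=1).astype(int)
print(df)
```

Comparing with eq(1) rather than relying on truthiness keeps a stray non-binary value (such as 200) from counting as a match if a numeric column ends up in the list by mistake.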
",python
+"I want to display all the data entered by user in view_project.html page in djangoviews.py
+@login_required(login_url="/accounts/login/")
+def add_project(request):
+ if request.method == 'POST':
+ form = forms.CreateProject(request.POST, request.FILES)
+ if form.is_valid():
+ # save in db
+ instance = form.save(commit=False)
+ instance.candidate = request.user
+ instance.save()
+ return redirect ('view_project')
+ else :
+ form = forms.CreateProject()
+ return render(request, 'home/add_project.html', {'form': form})
+
+@login_required(login_url="/accounts/login/")
+def view_project(request):
+ return render(request, 'home/view_project.html')
+
+models.py
+from django.db import models
+from django.contrib.auth.models import User
+
+# Create your models here.
+class project(models.Model):
+ Name_of_the_organisation_or_Individual_applying = models.CharField(max_length=200)
+ Name_of_the_Project = models.CharField(max_length=200)
+ Name_of_the_Principal_Investigator = models.CharField(max_length=200)
+ date = models.DateTimeField(auto_now_add = True)
+ Cover_Letter = models.FileField(upload_to=None, max_length=254)
+ Summary_of_Project = models.CharField(max_length=500)
+ Study_Proposal = models.CharField(max_length=1000)
+ Any_other_documents_required = models.FileField(upload_to=None, max_length=254)
+ candidate = models.ForeignKey(User, default=None, on_delete=models.CASCADE)
+
+This is the views.py file where the project is added into the database for a particular user and the HTML file used for this is add_project.html
","To display all the projects, you have to pass them to the template through the view's context. You also need to call a query method on the model whose data you want to retrieve, here objects.all(). Look at the code below to get a better idea.
+@login_required(login_url="/accounts/login/")
+def view_project(request):
+ projects = project.objects.all()
+ context = {"projects":projects}
+ return render(request, 'home/view_project.html', context)
+
+Then, in your HTML file, you can loop through the "projects" context like this:
+{% for project in projects %}
+<h1>{{project.Name_of_the_organisation_or_Individual_applying}} </h1>
+<h1>{{project.Name_of_the_Project}} </h1>
+{# and so on for all other fields #}
+{% endfor %}
+
",python
+"Python Dataframe get previous row Open High Low Close Volume Dividends Stock Splits
+Date
+2020-07-31 324.60 325.33 320.05 325.22 85210800 0.0 0
+2020-08-03 327.01 328.31 326.42 327.48 53077900 0.0 0
+2020-08-04 326.55 328.74 326.55 328.74 41917900 0.0 0
+
+How do I write the following in code:
+
+If current row volume is more than previous row volume then add new
+column titled position and subtract close - open of the current row ?
+
","Not the most elegant but should work:
+df['Position'] = np.nan
+df.loc[df['Volume']>df.shift(periods=1)['Volume'],'Position'] = df.loc[df['Volume']>df.shift(periods=1)['Volume'],'Close']-df.loc[df['Volume']>df.shift(periods=1)['Volume'],'Open']
+
+If you are ok with 0.00 for positions where the volume condition is not satisfied (rather than NaN), a simpler version will work:
+df['Position'] = (df.shift(periods=1)['Volume'] <df['Volume']) *(df['Close']-df['Open'])
+
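The repeated condition can also be computed once and reused. A minimal sketch, assuming the three rows from the question plus one made-up row (2020-08-05) so that the condition is true at least once; the `mask` variable is introduced here for illustration only:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Open': [324.60, 327.01, 326.55, 328.00],
        'Close': [325.22, 327.48, 328.74, 329.50],
        'Volume': [85210800, 53077900, 41917900, 50000000],
    },
    # The last date and its values are hypothetical, added for the example.
    index=pd.to_datetime(['2020-07-31', '2020-08-03', '2020-08-04', '2020-08-05']),
)

# True where the current row's volume exceeds the previous row's volume.
# The first row compares against NaN, which evaluates to False.
mask = df['Volume'] > df['Volume'].shift(1)

# Close - Open where the condition holds, NaN elsewhere.
df['Position'] = (df['Close'] - df['Open']).where(mask)
print(df)
```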
",python
+"Relative import between modulesI've written a group of functions that I want to use for my computation and I've organized them in some .py files, say functions1.py and functions2.py. Within the same folder I also have another file, main.py:
+root\
+- functions1.py
+- functions2.py
+- main.py
+
+
+Inside functions1.py suppose I have the following code:
+import numpy as np
+
+def mycos(x):
+ return np.cos(x)
+
+def mysin(x):
+ return np.sin(x)
+
+While inside functions2.py:
+from .functions1 import mysin, mycos
+
+def mytan(x):
+ return mysin(x)/mycos(x)
+
+
+Now suppose that main.py contains:
+import numpy as np
+from .functions2 import mytan
+
+angle = np.pi/3
+if mytan(angle) == np.tan(angle):
+ print('OK')
+
+Then, if I execute main.py, I get the following error:
+Traceback (most recent call last):
+ File "functions2.py", line 6, in <module>
+ from .functions1 import mysin, mycos
+ImportError: attempted relative import with no known parent package
+
+Did I miss something in the use of relative import?
","I think this is because, when you run the code, you are not executing it from the project directory. Try adding these lines at the beginning of main.py:
+import os # don't add this line if you have already imported os
+os.chdir(os.path.dirname(os.path.abspath(__file__)))
+
+Let me know if this works. If it doesn't, can you share the full code so I can better understand your problem?
+
+EDIT
+The problem should be that the files functions1.py and functions2.py are not actually part of a package, which relative imports require. To fix this, I am going to suggest two solutions:
+Create a folder
+As the two function files are not treated as modules of a package, create a folder and put them inside it, like so:
+MAINDIR
+– main.py
+– functions
+–– functions1.py
+–– functions2.py
+
+the code is this
+main.py
+import numpy as np
+from functions.functions2 import mytan
+
+angle = np.pi/3
+if mytan(angle) == np.tan(angle):
+ print('OK')
+
+functions1.py
+import numpy as np
+
+def mycos(x):
+ return np.cos(x)
+
+def mysin(x):
+ return np.sin(x)
+
+functions2.py
+from .functions1 import mysin, mycos
+
+def mytan(x):
+ return mysin(x)/mycos(x)
+
+
+Use __init__.py
+MAINDIR
+– main.py
+– functions
+–– __init__.py
+–– functions1.py
+–– functions2.py
+
+and this is the code
+__init__.py
+from . import functions1, functions2
+
+(and all other files just like the first solution)
+note that this solution is a bit longer, but this should be the most "correct" way to do this
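The layout above can be tried out with a small self-contained sketch. It is only an illustration: a temporary directory stands in for MAINDIR, `math` replaces `numpy` so nothing extra needs installing, and the `__init__.py` is left empty for simplicity.

```python
import importlib
import math
import sys
import tempfile
from pathlib import Path

# Temporary directory standing in for MAINDIR (an assumption for this sketch).
maindir = Path(tempfile.mkdtemp())
pkg = maindir / "functions"
pkg.mkdir()

# An empty __init__.py marks "functions" as a regular package.
(pkg / "__init__.py").write_text("")
(pkg / "functions1.py").write_text(
    "import math\n\n"
    "def mycos(x):\n    return math.cos(x)\n\n"
    "def mysin(x):\n    return math.sin(x)\n"
)
# The relative import works here because functions2 is loaded as part of a package.
(pkg / "functions2.py").write_text(
    "from .functions1 import mysin, mycos\n\n"
    "def mytan(x):\n    return mysin(x) / mycos(x)\n"
)

# main.py would sit in MAINDIR, so MAINDIR is the directory placed on sys.path.
sys.path.insert(0, str(maindir))
mytan = importlib.import_module("functions.functions2").mytan

angle = math.pi / 3
# Compare with a tolerance rather than ==, since floating-point results can differ slightly.
print(math.isclose(mytan(angle), math.tan(angle)))
```

The tolerance-based comparison is a deliberate change from the `==` check in main.py, which may fail on rounding differences even when the import works.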
+And then?
+If you want to better understand why it wasn't working, I would suggest reading these well-written articles:
+So this code is much less than a widget; it won't ever be displayed on its own. It needs to have the rgl code loaded first, and then it will add some rglwidgetClass methods.
+How should this be done? Is there a way to make an invisible widget, that just exists so that dependencies can be declared?
","I don't know if this is the best way, but here is one way to do what I want.
+How can I display the list that in each line I will have one li and not 2 li in one row?
+I am trying to make a navigation bar with dropdowns when you hover. I got this from w3schools.com, but I wanted to have multiple drop downs next to each other. I have 2 of them next to each other, but when I hover over either of them, it shows the same dropdown menu. How do I fix this? Sorry if this seems obvious, I'm a beginner.
+You had a simple error with your pairing your div tags.
+After I was done working on side bar I wanted to head to the content page to the right so I added a container div that I wanted to set to flex So that I make the two boxes to start styling, but right after adding it, the footer's position is ruined and the side bar isn't taking as much space as it should.
+Add width attribute to task-contianer and add your fotter class attribute on your CSS file.
+I got this from somewhere but when I tap on it on the phone (on PC it's ok) it stays with hover attributes = no BG and black colour to the next website refresh:
+What is wrong there? Or what I did wrong?
+×I don't know what to add more here, it won't let me post it because of mostly code.×
+Thank you.
","If I'm right then you're trying to make an icon to go to the Top.
+Here, I'm sharing a simple code for that, you'll understand easily.
+The idea is to display .html file in html.iFrame in Plotly Dash hosted in Kubernetes.
+The code works as intended in local however it's met with 404 ( in K8s log ) when trying to fetch the file in the pod (/project/assets/). I have went into the pod to check and the files are there. I have checked for file permissions and they are the same as local (-rw-r--r--), parents folders have the same permissions too. I have some other files in /project/cache folders that can be successfully displayed through dcc.Graph indicating that io is fine.
+Is there something unique about html.iFrame in Kubernetes that I have missed out? My apologies if the question is too terse but I am not even sure where or how to start debugging.
+Solved using srcDoc instead of src.
+I am struggling to position the partial border directly above the title without the title floating to the left of the 'before' pseudo element. It works fine on pages without an image but I need the title and text to wrap around the image on some pages where an image exists but not on others and this is causing the issue.
+I have tried absolute positioning, different displays but nothing works.
+Because of different scrollbar sizes, I'm having a problem using a fixed component on both mobile and desktop:
+While the desktop version get correctly aligned with the components below, the mobile version doesn't.
+If I fix the width for the mobile the desktop will go over the component limit and it just doesn't work.
+My css for it is as follow, the commented width is the one that works for the mobile.
+Anyone have any ideas?
+I want the css codes of the blog1popup class to change when the image is clicked.
+I know how to do this with hover, but i don't know how to make this action happen by clicking.
+
+The element I want to use as this button;
+You can add click event on blog-item class, then you can add classes to any element or change any css property using jquery methods.
+Eg.
+we have a problem (we are a group).
+We have to use jsoup in java for an university project. We can parse Htmls with it. But the problem is that we have to parse an html which updates when you click on a button (https://www.bundestag.de/services/opendata).
+We want to access all xmls from "Wahlperiode 20". But when you click on the slide buttons the html code updates but the html url stays the same. But you have never access to all xmls in the html because the html is updating over the slide button.
+Another idea was to find out how the urls of the xmls we want to access are built so that we dont have to deal with the slide buttons and only access the xml urls. But they are all built different.
+So we are all desperate how to go on. I hope y'all can help us :)
","The problem is that websites aren't static resources; they have javascript, and that javascript can fetch more data in response to e.g. the user clicking a 'next page' button.
+What you're doing is called 'scraping': Using automated tools to attempt to query for data via a communication channel (namely: This website) which is definitely not meant for that. This website is not meant to be read with software. It's meant to be read with eyeballs. If someone decides to change the design of this page and you did have a working scraper, it would then fail after the design update, for example.
+This data is surely open, and open data tends to come with APIs: things meant to be queried by software and not by eyeballs. Go look for it, and call the German government; I'm sure they'll help you out! If they've really embraced the REST principles of design, then send an Accept header that includes e.g. application/json and application/xml and does not include text/html, and see if the site just responds with the data in JSON or XML format.
+In just about every browser there's 'dev tools'. For example, in Vivaldi, it's under the "Tools" menu and is called "Developer tools". You can also usually right click anywhere on a web page and there will be an option for 'Inspect', 'Inspector', or 'Development Tools'. Open that now, and find the 'network' tab. When you (re)load this page, you'll see all the resources it's loading in (so, images, the HTML itself, CSS, the works). Look through it, find the interesting stuff. In this specific case, the loading of wahlperioden.json is of particular interest.
+That sounds useful, and as it's JSON you can just read this stuff with a JSON parser. No need to use JSoup (JSoup is great as a library, but it's a library that you can use when all other options have failed, and any code written with JSoup is fragile and complicated simply because scraping sites is fragile and complicated).
+Then, click on the buttons that 'load new data' and check if network traffic ensues. And so it does: when you do so, you notice a call going out. I'm seeing this URL being loaded:
+The format is rather obvious. offset=10 means: start from the 10th element (as I just clicked 'next page') and limit=10 means: no more than 10 elements.
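As a rough sketch of how such offset/limit paging could be walked from code (Python here for brevity; the same loop is straightforward in Java). The base URL below is a placeholder, not the real endpoint, and `page_urls` is a hypothetical helper:

```python
from urllib.parse import urlencode


def page_urls(base_url, total, limit=10):
    """Build one URL per page, stepping the offset by `limit`."""
    return [
        f"{base_url}?{urlencode({'offset': offset, 'limit': limit})}"
        for offset in range(0, total, limit)
    ]


# Placeholder endpoint (assumption); substitute the URL seen in the network tab.
urls = page_urls("https://example.org/data", total=25, limit=10)
for url in urls:
    print(url)
```

Each generated URL can then be fetched and parsed on its own, so no button clicks need to be simulated.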
+Javascript can also generate entire HTML on its own; not something jsoup can ever do for you: The only way to obtain such HTML is to actually let the javascript do its work, which means you need an entire browser. Tools like selenium will start a real browser but let you use JSoup-like constructs to retrieve information from the page (instead of what browsers usually do, which is to transmit the rendered data to your eyeballs). This tends to always work, but is incredibly complicated and quite slow (you're running an entire browser and really rendering the site, even if you can't see it - that's happening under the hood!).
+Selenium isn't meant as a scraping tool; it's meant as a front-end testing tool. But you can use it to scrape stuff, and will have to if it's generated HTML. Fortunately, you're lucky here.
+[1] I'm using the definition of: Using a tool or site to accomplish something it was obviously not designed for. The sense of 'I bought half an ikea cupboard and half of an ikea bookshelf that are completely unrelated, and put them together anyway, look at how awesome this thingie is' - that sense of 'hack'. Not the sense of 'illegal'.
",html
+"How to center the content of (centered) pure css columnsThe snippet presents a series of columns containing a single letter. I'd like to have the center of the glyph in the center of each, but as you can see, they not quite centered -- appearing just off to the right. (I suspect by half the glyph width, but I'm not sure).
+I have struggled to find any clarification on this in the WCAG docs I have read.
","I'm making a website where I want to have text written in the middle of a rectangle. The text will be a number from 1 to 100. For some reason ctx.fillText() isn't doing anything. My canvas isn't small. It's as big as the image that's being drawn on it.
+If we remove those textWidth & textHeight your drawSelectionRect function works.
+I believe what you are trying to accomplish is to align the text, perhaps with these, you can do it:
+I need the user to type a word and if its the right keyword, I need it to change pages.
+for example I want something a bit like this.
+In the solution below, the user input is evaluated to enable/disable the disabled attribute of the submit button.
+I am trying to get the text only from the h2 and the first p tag. I've been using class name to find the div and the output gives me all of the text in the div (obviously).
+I assume the answer is to use xpath but I can't figure out how to include both tags. Do I need to use two separate lines of code to do it or can I combine both into one?
","I solved it using css selectors, but didn't combine them into one. Another commenter's answer using xpath and class name combining the two is a possible solution.
+What is the best way to approach structuring the content? Do I just simply wrap the parenthetical aside above in a <span> tag and style it differently like this?
+I need some help when hovering over article , i want to cover the article below it instead of pushing the whole next row, i tried to give it some padding and then heigh but i'm getting the same result. I tried also with Jquery to use .hover() function and it gave the same result.
+The whole container has a display flex , and every article has a width and height , and when hovering over the article it gets more height. The article without hovering has a 480px and then when hovering 500px
+How can I integrate into HTML pages 2 different CSS styles for 2 different images, with the two separated vertically by a space?
+See the paintings below. They have different sizes, different styles.
+Something like this. You may change the dimensions anytime
+I am quite a beginner and trying to learn JavaScript. I am trying to improve an example for Popup.
+I want to adjust the left margin of the popup so I can center the element to the text.
+I added the CSS for reference but I am trying to do it in JS.
+Basically the problem is if I put the code as below it works.
+But if I define it with variable and put it like below, it won't work.
+I don't know if I am missing anything. I added the all code below for reference.
+In your example (see below) you tried to set the margin to be equal to a TextNode, which is not a valid value.
+You are looking for something like this.
+So I am trying to create a logo and a menu icon in the header but for some reason, they are always overflowing the height of the header which I have strictly specified! Why is that ?
+For example, I tried to create a hamburger icon but I could not because of this overflow issue. The menu lines were working as if the entire element is shown but I had to hide it out so that it could fit into the header.
+Basically you are forcing your elements to be higher than the header itself by giving them static heights (height 100px on the menu and padding-top/bottom 30px on the logo)
+Using height 100%, so the elements adapt to the header.
+Let me know if this solves your problem. If not, let me know in more detail what you're trying to accomplish.
+Upon loading this page it uses google to nicely show the rows in ColumnChart.
+The html would show barchart.
+I want to create a javascript button that will replace text in the html from ColumnChart to BarChart and reload the page, something like:
+But this reload the page with the initial values. If I don't add reload I don't see any difference of course.
+I would like to put inputs label on border cross section and would like to hide borders under label like this
+
+I did this by simply making label background white but it obviously doesn't work if I place it inside grey div any other solution for this?
+So basically I'm trying to inject information that comes from my database (I'm using PostgreSQL and Node.js) into the webpage. I have confirmed that I'm getting the information I want right before setting innerHTML; however, as soon as it reaches that line of code, it somehow can't detect the value and injects undefined (I think), and I'm not sure why it's doing that.
+I find it really weird since i have done it similarly with another page and it works fine.
+I don't know why fills that field with undefined since when i stringify what the ajax return it shows that it has exactly what i need, one of those fields being the cour_name row.
+I'd really appreciate if someone could help me with this problem and thank you for your time.
","If using the Chrome browser, try the DevTools debugger.
+Open DevTools (e.g. press F12), then click on the Sources tab, locate your JavaScript file, and put a breakpoint on the line that is not working as expected. Then reload the page, check the values of the variables, and execute the rest of the code line by line.
+I am currently adjusting my navbar and I want to remove the functionality of one anchor in my ul while still matching the style of the other anchors.
+I already tried to use an id on the specific li-element and replacing it with a button (just copy-pasting the css), but nothing works so far.
+I am looking for the method that allows to modify a value/text on my home page with the used link.
+without changing anything in my code.
+This is possible with HTML, but you need JavaScript. Here's an example:
+Note that this won't work here, but it will work in the website.
+As @JNa0 said, PHP is better suited to this task. The PHP would look like echo $_GET["name"];
",html
+"how can i remove underline in linki am working on a project in which i have to make a webpage .i have some links for going web sites and i want to remove underline in link ( a ).
+please help me.
+It is a nice effect and I want to use it but when I am doing so it is placed over my website's navbar and thus the links are not clickable
+I've searched and searched for what I thought would be a simple find but haven't found anything. I have this form:
+I want it indented so it doesn't touch the side of the area it's in, I've tried something like <div style="text-indent-left:20px"> wrapped around it but didn't phase it.
+Thanks.
","I just began to learn Three.js and I'm facing a problem when trying to add a plane geometry based on an img tag in the DOM.
+As you can see in the screenshots, the plane geometry does not have the exact size of my aspect div (in red):
+For the moment I manually set the z axis of my camera, but I know it's not a good solution, and I just wanted to know if there is an existing solution to achieve this?
+I have two div and I want to apply some css on the first div that has no class or id .How can I do it ??
+For example, in my attached code, submitting works when no value is entered, but it fails if the length of the entered value is less than five.
+Why does it behave like this?
+In my opinion the browser should prevent a submit when the minlength requirement is not met.
+It is nature of minlength.
+The content in the right div will be added dynamically, in this case we have red boxes. So it may contain 0 or more of those red boxes.
+I have used overflow-x: auto; which helps to generate a horizontal scroll bar. Hope this works fine.
",html
+"Cannot use addEventListener() on a createElement() (a button) after displaying it in the DOM (in javascript)I am creating a Javascript framework recently and I am facing two problems that I don't understand at all. When I create a new button with createElement() in Javascript, I can't place it inside several elements at the same time without killing the "click" event (if this is an id, it works).
+Create the button inside the loop instead, and both problems are solved.
+I have been trying to change the delimiter of HtmlWebpackPlugin ejs since I need to be able to output <%.
+Any ideas what I can do to deal with this issue? I just need to be able to output a aspx file which has "<%-- " and "<%@" on it. the rest I can handle with the sample file they have to put the assets in the right place.
","I figured it out. You need to do the following:
+How can I use localStorage to store the data of a form. Here is the code:
+But I don't know how to grab the css of the whole thing. For instance, if half the selection is Arial font, and the other is Serif. I want it to pick the first. But I don't even know how to start with getting the CSS. Here is what I have so far:
+I ran it on this page (via console) and I saw over 400 css properties listed.
+I have some problems to get the content out of the django database.
+I'm a bloody beginner at django so I did some tutorials. They've worked fine. Now I want to reproduce them in my own project with own variables so I can see if I've understood all the stuff.
+This works because I could add content to the database.
+So there is no mistake.
+I guess it's right. But the content of author doesn't showup, so what did I miss?
+I can call static files via django, so couldn't be something wrong with django...
+Can anyone help.
+Thanks in advance.
","I think that you need to use a for loop to show "aboutme" because with "aboutme = Aboutme.objects.all()" you query a list of elements.
+I've got a winform that contains a webview.
+At some points the webview will display html content including input boxes for the end user to throw some responses into.
+Is it possible for me to interact with the input box in the webview to grab its contents programatically?
+Even if its just to display that text in a messagebox as an example, that will get me well on my way!
+so in the example form below, I'd be looking to grab whatever the end user entered into 'userresponse' and throw it into a variable. like...
+I have a span tag inside of a paragraph that updates every second, and whenever it does, it keeps on shifting the text either to the right or left:
+I can use media queries, but is there another CSS solution? Can I combine these calc function with min() max() or clamp() to avoid negative values?
+i found a possible solution. It works with max() in this way
+for all who are interested. I ended up with this css rules.
+How can I make it so I have one image on the left, with a paragraph on the right. Then another image on the right, with a paragraph on the left.
+As the title states I want to get the first instance of a value that is given to the attribute in a data-*
+Let's say I already have a list of all the unique values found in data-group. How would I be able to specify the divs that have the first instance of "1", "2", and "3".
+The goal of this is to be able to prepend something before the first instance of each of the unique data-group using jQuery
","Hello, guys! I am playing with this structure and trying to link each sections to its corresponding NAV option so I can display only the content selected (as a different page). However with the structure of an "id" attribute in the section and "href" attribute to the NAV option doesn't help too much, because only switches the content on the page, not the page itself. I've tried my best to do this, but this is the best I could do.
","I'm using the plyr library to upload a video. The video URL comes from a server, however the video is not loaded and I get the following error in chrome: Not allowed to load local resource.
+If you look at the error message you are getting above you will see that it is trying to open the video from a URL starting with:
+The browser is interpreting that this file is a local file and not on a server - i.e. even though the 'video URL comes from a server' as you say the URL which the browser is seeing is a local URL.
+If you want the source to be a video on a server your HTML code you should either see an absolute server source URL like this example:
+Or a URL which is relative to your own web server serving the current page like this:
+If, in your example, 'MyServer' is a valid 'authority name, e.g. 'MyServerDomainName.com:8080', then you URL is indeed a valid URL according to the RFC. It is referred to as a network-path reference URI and described in section 4.2 of the RFC linked below.
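How a network-path reference picks up the scheme of the current page can be illustrated with Python's standard `urllib.parse.urljoin`. The page URL and the video path below are made-up examples, not the URLs from the question:

```python
from urllib.parse import urljoin

# Hypothetical page URL and network-path reference (assumptions for illustration).
page = "https://example.com/watch/index.html"
reference = "//MyServerDomainName.com:8080/videos/clip.mp4"

# The reference supplies the authority and path; the scheme is taken from the page.
resolved = urljoin(page, reference)
print(resolved)
```

With an http:// page the same reference would resolve to an http:// URL instead, which is the usual reason for choosing this format.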
+Different browsers may treat this differently, and Chrome does seem to default to a local file when it sees a URI starting with a double forward slash if you enter it directly in the URL tab of the browser.
+However, with a valid 'authority' in the URL it should resolve to a video - see the example snippet below which is tested on Chrome and Safari and plays the video:
+If this is not working in your case and defaulting to a local file, it seems most likely that URL does not contain a valid 'authority' part. However, in this case it should give a name not resolved case on Chrome, as in the snippet below:
+It is worth noting that this is a very specific URL format and it may be that you don't actually need to use this format. It is usually used so that the request uses either HTTP or HTTPS in line with the current page. See discussion here and the definition in the RFC also:
+